Abstract


The purpose of this report is to explore a number of current
research directions in the fields of digital signal processing and
modern control and estimation theory. We examine topics such as
stability theory, linear prediction and parameter identification, 
system synthesis and implementation, two-dimensional filtering, 
decentralized control and estimation, image processing, and non- 
linear system theory, in order to uncover some of the basic simi- 
larities and differences in the goals, techniques, and philosophy 
of the two disciplines. An extensive bibliography is included in 
the hope that it will allow the interested reader to delve more 
deeply into some of these interconnections than is possible in 
this survey.

OUTLINE


Introduction: Point of View, Goals, and Overview

Section A: Stability Analysis 

Basic Stability Problems in Both Disciplines

1. Limit cycles caused by the effects of finite 
arithmetic in digital filters. 

2. Analysis of feedback control systems 

Subsection A.1: The Use of Lyapunov Theory

1. Basic Lyapunov theory for nonlinear and linear
systems. 

2. Uses in digital filter analysis 

a. Bounds on limit cycle magnitude 

b. Pseudopower as a Lyapunov function 
for wave digital filters 

3. Uses in control theory

a. Stability of optimal linear-quadratic
controllers and linear estimators.

b. Use in obtaining more explicit
stability criteria.

Subsection A.2: Frequency Domain Criteria, Passivity, and
Lyapunov Functions

a. The use of passivity concepts to study 
feedback systems. 

b. Frequency domain stability criteria arising
from the study of passive systems, sector
nonlinearities, and positive real functions.

c. Analogous results for the absence of limit
cycles in digital filters.

d. Relationship between input/output stability
and internal stability.

e. Generation of Lyapunov functions for
dissipative systems.

Speculation: The effect of finite arithmetic on digitally
implemented feedback control systems.

Section B: Parameter Identification, Linear Prediction, Least
Squares, and Kalman Filtering

Basic Problems in Both Disciplines 

1. System identification and its uses in problems
of estimation, control, and adaptive systems.

2. Parametric modeling of speech for digital pro- 
cessing applications; the all-pole model and 
the linear prediction formulation. 

Subsection B.1: The Autocorrelation Method, Kalman Filtering
for Stationary Processes, and Fast Algorithms

1. Derivation of the Toeplitz normal equations for
the autocorrelation method for linear
prediction; stochastic interpretation.

2. Interpretation of predictor coefficients for dif- 
ferent order predictors as the time-varying 
weighting pattern of the optimal predictor. 

3. The Kalman filter as a realization of the optimal 
predictor for autocorrelations which come from 
linear, constant coefficient state equations. 

4. Levinson's fast algorithm for solving the normal
equations, and its relation to fast methods
for computing the Kalman gain.

Subsection B.2: The Covariance Method, Recursive Least
Squares Identification, and Kalman Filters

1. Derivation of the normal equations for the 
covariance method for linear prediction; sto- 
chastic interpretation. 

2. Fast algorithms for the solution of the normal
equations.

3. Derivation of the recursive form of the solution,
and its interpretation as a Kalman filter.

4. Speculation on the use of this formulation to 
track time-varying speech parameters. 

Subsection B.3: Design of a Predictor as a Stochastic
Realization Problem 

1. The stochastic realization problem and its
relationship to spectral factorization.

2. The innovations representation and the two-step
fast algorithm for determining the optimal predictor;
the potential use of this method for identifying
pole-zero models.

3. Examination of the numerical aspects of the
stochastic realization problem and the difference in
intent between this problem and parametric model
fitting.

Subsection B.4: Some Other Issues in System Identification

1. Other Kalman filter methods for approximate least
squares and maximum likelihood identification of
pole-zero models.

2. The use of cepstral analysis to identify
pole-zero models.

Section C: Synthesis, Realization, and Implementation

Subsection C.1: State Space Realizations and State Space
Design Techniques

1. Fundamentals of realization theory; controllability,
observability, and minimality.

2. Use of realization techniques in order to apply
multivariable state space design algorithms and
analysis techniques; examples of observer and
estimator design and sensitivity and covariance
analysis.

Subsection C.2: The Implementation of Digital Systems and
Filters

1. Basic issues involved in digital system design.

2. Review of several design techniques.

3. Issues involved in the implementation of a
filter using finite precision arithmetic;
minimality, computational complexity,
coefficient sensitivity, and quantization effects.

4. Basic techniques for implementing FIR and
IIR filters.

5. State space realizations and filter
structures; the inadequacy of state space methods
for specifying all structures.

6. Use of state space methods to analyze the sensitivity
and roundoff noise behavior of digital filter
structures.

Subsection C.3: Direct Design Taking Digital Implementation Into
Account

1. Issues involved in the digital implementation of 
optimal control systems; the effect of algorithm 
complexity on allowable sampling rates. 

2. Designs that are amenable to modular, parallel, 
or distributed processing. 

Section D: Multiparameter Systems, Distributed Processes, and
Random Fields

Subsection D.1: Two Dimensional Systems and Filters

1. Two-dimensional shift-invariant linear systems; 
convolution, transforms, and difference equations. 

2. Computational considerations for FIR and IIR
filters; recursibility, quadrant and half-plane
causality, precedence relations and partial orders.

3. Storage requirements and their relation to
boundary conditions and the range of 2-D
space considered.

4. Half-plane 2-D causality and its relation to 1-D 
distributed or multivariable systems and decen- 
tralized decision making. 

5. Processing of 2-D data by 1-D techniques using 
projections or scan ordering. 

6. Stability for recursive 2-D systems; algebraic
techniques and problems caused by
nonfactorability of multivariable polynomials.

7. Stabilization and spectral factorization to break
systems into stable quadrant or half-plane pieces.

8. Problems with 2-D least-squares inverse design. 

9. Use of 1-D structures and design techniques for 
2-D systems; separable systems, rotated systems, 
and McClellan transformations. 

10. Extension of 1-D design techniques to 2-D.

11. State space models in 2-D; local and global
state realizations and relations to recursible
2-D systems.

12. Speculation concerning the extension of state
space techniques, such as Lyapunov and covariance
analysis, to the analysis of 2-D systems.

13. Relations between 2-D linear systems and certain 
1-D nonlinear systems. 

Subsection D.2: Image Processing, Random Fields, and Space- 

Time Systems 

1. Discussion of the image formation process and 
the point-spread function. 

2. Models for recorded and stored images; density 
and intensity images. 

3. The image as a random field with first and 
second order statistics. 

4. Representation and coding of images; Karhunen- 
Loeve representation and the circulant approxi- 
mation for fast processing of images with 
stationary statistics. 

5. Nonrecursive restoration techniques 

a. Inverse filter 

b. Wiener filter

c. Constrained least squares

d. Geometric mean filter 

6. Recursive restoration techniques 

a. 1-D Kalman filtering of the scanned 
image 

b. 2-D Kalman filtering for images
modeled using half-plane shaping filters

c. Efficient optimal estimation for
separable 2-D systems

d. Reduced-update, suboptimal linear
filtering

e. Transform techniques for efficient 
optimal processing of systems des- 
cribed by "nearest neighbor" and 
"semicausal" stochastic difference 
equations. 

7. Discussion of limitations of and questions raised 
by recursive and nonrecursive restoration techniques 

a. Need for a priori model of the image

b. Limitations of recursive shaping filter
image models

c. Incorporation of image blur into
restoration schemes

d. Effects of image sensing nonlinearities 

e. Positivity constraints on estimated 
intensities 

f. Resolution-noise suppression tradeoff,
contrast enhancement, and edge detection.

8. Statistical and probabilistic models of random fields

a. Markov field models and interpolative 
filter models 

b. Two-dimensional linear prediction 

c. Statistical inference on random 
fields; maximum likelihood para- 
meter estimation 

d. Multidimensional stochastic calculus 
and martingales. 

9. Space-time processes and multivariable 1-D systems 

a. Seismic signal processing problems; 
velocity and delay-time analysis as 
2-D problems 

b. Use of 1-D and 2-D recursive
stochastic techniques to solve
space-time signal processing problems

c. Large scale systems as 2-D systems;
reinterpretation of 2-D image
processing techniques as efficient
centralized and decentralized
estimation systems for multivariable
1-D systems.

Section E: Some Issues in Nonlinear Systems Analysis: Homomorphic
Filtering, Bilinear Systems, and Algebraic System Theory

Basic Concepts of Homomorphic Filtering
Multiplicative Homomorphic Systems as a Special Case
of Bilinear Systems
Optimal Estimation for Bilinear Systems
Other Algebraic Techniques for the Analysis of
Nonlinear Systems

Concluding Remarks

Appendix 1: A Lyapunov Function Argument for the Limit Cycle
Problem in a Second-Order Filter

Appendix 2: The Discrete Fourier Transform and Circulant Matrices

References

Introduction: Point of View, Goals, and Overview

This report has grown out of a series of discussions over the past
year between the author and Prof. Alan V. Oppenheim of M.I.T. These
talks were motivated by a mutual belief that there were enough
similarities and differences in our philosophies, goals, and analytical
techniques to indicate that a concerted effort to understand these better
might lead to some useful interaction and collaboration. In addition,
it became clear after a short while that one could not accomplish this
by trying to understand the two fields in the abstract. Rather, we felt
that it was best to examine several specific topics in detail in order
to develop this understanding, and it is out of this study that this
report has emerged.

Thus the goal of this report is to explore several directions of
current research in the fields of digital signal processing and modern
control and estimation theory. Our examination will in general not be
result-oriented. Instead, we are most interested in understanding the
goals of the research and the methods and approach used. Understanding
the goals may help us to see why the techniques used in the two
disciplines differ. Inspecting the methods and approaches may allow one to
see areas in which concepts in one field may be usefully applied in the
other. The report undoubtedly has a control-oriented flavor, since it
reflects the author's background and also since the original purpose of
this study was to present a control theorist's point of view at the 1976
Arden House Workshop on Digital Signal Processing. However, an effort
has been made to explore avenues in both disciplines in order to encourage
researchers in the two fields to continue along these lines.

It is hoped that the above comments will help explain the spirit in
which this report has been written. In reading through the report, the
reader may find many comments that are either partially or totally
unsubstantiated or that are much too black and white. These points have been
included in keeping with the speculative nature of the study. However,
we have attempted to provide background for our speculation and have
limited these comments to questions which we feel represent exciting
opportunities for interaction and collaboration. Clearly these issues
must be studied at a far deeper level than is possible in this initial
survey-oriented effort. Also, we have not been so presumptuous as to
attempt to define the two fields (although some may feel we come
dangerously close), since we feel that a valid mutual understanding can and
will grow out of closer examination of the directions we describe. To
this end, we have included an extensive bibliography which should help
the interested reader to make inroads into the various areas.

The following is an annotated list of the topics considered in the
following sections. Sections are denoted by capital letters, and, for
ease of reference, the bibliography is coded similarly (e.g., [A-21] is
the 21st reference for Section A, Stability Analysis). Due to variations
in the author's expertise, maturity of the subject areas, and nature of
the questions, the sections vary greatly in depth and style. Some
sections are very specific, while others are more philosophical and speculative.

A. Stability Analysis -- In this section we discuss methods used
in both disciplines for the study of stability characteristics
of systems. In digital signal processing one is primarily
concerned with the possibility of limit cycles caused by the effects
of finite arithmetic in digital filters. In control theory, one
is often concerned with determining conditions for stability of
feedback systems. The techniques used in the two disciplines
have many similarities. Lyapunov theory, frequency domain
methods, and the concept of passivity are widely used by
researchers in both fields. We speculate on a potential research
topic -- the effects of finite arithmetic on digitally
implemented feedback control systems.

B. Parameter Identification, Linear Prediction, Least Squares, and
Kalman Filtering -- Identification of parametric models arises
in a variety of problems, from digital processing of speech to
adaptive control. Using the speech problem as a focus, we
explore several methods for identification. We examine the
autocorrelation method for linear prediction and relate it to
the determination of the time-varying weighting pattern of an
optimum predictor. We also discuss the efficient Levinson
algorithm and its relationship to recently developed fast
algorithms for determining optimum time-varying Kalman filter gains.
The covariance method for linear prediction is discussed, as are
its relationships with the Kalman filter structure of recursive
least squares. Using this framework, we speculate on potential
recursive methods for identifying time-varying models for speech.
We also discuss the relationship between the parametric
identification problem and the problem of stochastic realization.
Crucial differences in the underlying assumptions are brought
out, and we speculate on the utility of a stochastic realization
approach for the identification of pole-zero models of speech.
We also discuss other pole-zero identification techniques,
including recursive maximum likelihood methods, which resemble
recursive least squares (and hence the covariance method) both in form
and spirit.

C. Synthesis, Realization, and Implementation -- We discuss state
space models and realization theory and the uses of such
realizations for direct synthesis and for "indirect synthesis", in
which a state space model of a process of interest allows one
to apply state space methods to synthesize systems for
estimation, stabilization, optimal control, etc. We also explore the
key issues involved in the design of digital filters meeting
certain design specifications. We discuss several filter design
methods, but the major emphasis of our examination of this
topic is on filter structures. Minimality -- the key concept in
state space realization theory -- is only one of several issues.
Sensitivity and behavior in the presence of perturbations caused
by finite arithmetic are crucial questions as well. Here we
find some limitations of state space methods. Not all minimal
structures can be obtained from straightforward algorithmic
interpretations of different state space realizations. We
speculate on some recent work indicating that state space methods
may be useful in analyzing the performance of different
structures, that certain factorizations of state space realizations
include all structures, and that state realizations combined
with an understanding of structures issues may lead to useful
implementations for multivariable filters. Finally, we
speculate on the possibility of designing controllers, filters,
or other systems by directly taking the constraints of digital
implementation into account from the start. This area contains
some intriguing, potentially very useful, and extremely difficult
problems.

D. Multiparameter Systems, Distributed Processes, and Random
Fields -- We explore a number of the issues that arise in
studying systems defined with two or more independent variables.
We see that the issues of recursion, causality, and the
sequencing of the required computations for a filter become
extremely complicated in this setting. We find that a precedence
relation among the computations exists and is of the same form
and spirit as the precedence relation arising in
multi-decision-maker control problems. A number of relationships with
one-dimensional concepts are explored; specifically, a
multidimensional system can be made into a (often quite complex)
one-dimensional system by totally ordering the computations in a way
that is compatible with the precedence relation. We also
discuss the possibility of transforming distributed or multivariable
systems to scalar, multidimensional systems, and we speculate
on the utility of such an approach. The algebraic difficulties
that arise in multidimensional problems lead to complications
in areas such as stability analysis and spectral factorization,
and we also point out that similar algebraic problems arise in
considering lumped-distributed systems, certain time-varying
systems, and specific classes of nonlinear systems. A number
of design methods are discussed, and many of these are closely
related or in fact rely on one-dimensional methods. We also
describe a number of state space models for multidimensional
systems, and we run into many of the same difficulties --
causality, nonfactorizability, etc. We speculate on the utility
of state models for stability and roundoff noise analysis and
for multidimensional recursive Kalman filtering. We discuss a
number of statistical and probabilistic approaches to
multidimensional filtering and analyze their utility in the context
of the problem of image processing. We also speculate on the
utility of the two-dimensional stochastic framework for the
consideration of space-time and decentralized control problems.
This section offers some of the most exciting and difficult
potential research directions.

E. Some Issues in Nonlinear System Analysis: Homomorphic Filtering,
Bilinear Systems, and Algebraic System Theory -- There has been
substantial work in both disciplines in analyzing and synthesizing
nonlinear dynamic systems that possess certain types of algebraic
structure. We consider the work in digital signal processing on
homomorphic systems and filter design, and we relate this to
some work on state space models that possess related algebraic
properties.


Finally, we make some concluding remarks, summing up our feelings
about the relationship of the two fields and the possibility of increased
interaction. From the point of view of Prof. Oppenheim and the author,
this study has been a success, since we are convinced of the benefit of
such interaction. This report will be a success if we can convince
others.





A. Stability Analysis

Of all of the topics that we have investigated, it is in this area that
we have found some of the clearest areas of intersection and interaction
between the disciplines. In the field of digital signal processing, stability
issues arise when one considers the consequences of finite word length in
digital filters. Two problems arise (not mentioning the effects due to finite
accuracy in filter coefficients [A-12,C-1]). On the one hand, a digital filter
necessarily has finite range, and thus overflows can occur, while on the
other, one is inevitably faced with the problem of numerical quantization --
roundoff or truncation. Since the filter has finite range (it is after all a
finite-state machine) the question of the state of the filter growing without
bound is irrelevant. However, the nonlinearities in the filter, introduced
by whatever form of finite arithmetic is used, can cause zero-input limit
cycles and can also lead to discrepancies between the ideal and actual
response of the filter to certain inputs. Following the discussions in [A-3,15],
the typical situation with which one is concerned is depicted in Figure A.1.

The filter is described (in state-variable form) by equations of the form

$$
\begin{aligned}
x(n+1) &= A x(n) + B u(n) \\
y(n) &= C x(n) \\
X(n) &= N(x(n))
\end{aligned}
\tag{A.1}
$$

where N is a nonlinear, memoryless function that accounts for the effects of
overflow and quantization. If these effects were not present -- i.e., if N
were the identity function -- equation (A.1) would reduce to a linear equation.
If one assumes that this associated linear system is designed to meet certain
specifications, one would like to know how the nonlinearity N affects overall
performance. In particular, one important question is: assuming that the
linear system is asymptotically stable, can the nonlinear system (A.1) sustain
undriven oscillations, and will its response to inputs deviate significantly
from the response of the linear system? We will make a few remarks about this
question in a moment. We refer the reader to the survey papers [A-3,5] and
to the references for more detailed descriptions of known results.
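
The question just posed can be illustrated numerically. The following is a minimal sketch, not taken from the report: it assumes that the nonlinearity acts on the whole state update, x(n+1) = N(A x(n)), with zero input, and that N rounds each state component to the nearest integer (ties away from zero). The matrix `A`, the initial state, and the function names are hypothetical choices made only for illustration. The ideal linear filter is asymptotically stable, yet the quantized filter sustains an undriven oscillation.

```python
import math


def quantize(v):
    """Assumed roundoff model: nearest integer, ties rounded away from zero."""
    return int(math.copysign(math.floor(abs(v) + 0.5), v))


def step(A, x, N=None):
    """One zero-input step. N=None gives the ideal linear filter x(n+1) = A x(n);
    otherwise the (assumed) quantized update x(n+1) = N(A x(n))."""
    nxt = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return [N(v) for v in nxt] if N else nxt


def simulate(A, x0, steps, N=None):
    """Return the list of states x(0), x(1), ..., x(steps)."""
    traj = [list(x0)]
    for _ in range(steps):
        traj.append(step(A, traj[-1], N))
    return traj


# Hypothetical second-order filter: poles at +/- j*sqrt(0.9), inside the unit circle.
A = [[0.0, 1.0], [-0.9, 0.0]]
x0 = [3, 0]

ideal = simulate(A, x0, 200)             # linear response decays toward zero
actual = simulate(A, x0, 200, quantize)  # quantized response never decays

print(max(abs(v) for v in ideal[-1]))
print(actual[:5])                        # a period-4 cycle of amplitude 3
```

In this example the initial state lies inside the rounding "deadband" (0.9 x 3 = 2.7 rounds back to 3), so the rounding restores exactly the amplitude that the linear dynamics would have dissipated. Other quantizer placements or rounding rules would give different cycles.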

In control theory the question of system stability has long played a 
central role in the design and analysis of feedback systems. Following 
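[A-42], a typical feedback system, depicted in Figure A.2, is described by
the functional equations

$$
e_1 = u_1 - y_2, \qquad e_2 = u_2 + y_1, \qquad y_1 = G_1 e_1, \qquad y_2 = G_2 e_2 \tag{A.2}
$$

where $u_1$, $u_2$, $e_1$, $e_2$, $y_1$, and $y_2$ are functions (of time -- discrete or
continuous) and $G_1$ and $G_2$ are operators (possibly nonlinear) describing the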
dynamics of the forward and feedback paths, respectively. In control theory 
one is interested either in the analysis or the synthesis of such systems. 

In the synthesis problem one is given an open loop system and is asked to
define a feedback system (A.2) such that the overall system has certain
desirable stability properties. In the case of stability analysis, with which
we are most concerned here, one may be interested either in the driven or
the undriven ($u_i = 0$) characteristics. In the driven case one wishes to determine,
for example [A-42], if bounded inputs lead to bounded outputs and if the
input-output relationship is continuous -- i.e., if small changes in the u's lead to
small changes in the y's. In the undriven case, one wishes to determine
if the system response decays, remains bounded, or diverges when the only
perturbing influences are initial conditions. Again, the literature in
this area is quite extensive, and we refer the reader to the texts
[A-42,44,47], the survey paper [A-43], and to the references for more on
these problems.

From the above descriptions one gets a clear indication about some of
the similarities and differences in the two topics.^1 In both areas one
wants the answers to some qualitative questions -- is the system stable; is
it asymptotically stable; is the system continuous [A-42] or does it exhibit
"jumps" when one makes small changes in the inputs [A-32,43]? In addition,
one often wants some quantitative answers. In digital filter design one is
often interested in determining bounds on the magnitudes of limit cycles
and in finding out how many bits one needs to keep the magnitudes of such
oscillations within tolerable limits. In the study of feedback control
systems one is interested in measures of stability as provided by quantities
such as damping ratios and eigenvalues (poles). In addition, one is often
interested in the shapes of these modes -- i.e., in determining the state
eigenvector corresponding to a particular eigenvalue.^2


^1 One of the most trivial of these is the fact that control theorists put minus
signs in their feedback loops, while there are none in the nonlinear digital
filter of Figure A.1. The reader should be careful to make the proper changes
of sign in switching between results.

^2 This is of interest, for example, in the design of stability augmentation
systems for aircraft. In this case one is quite interested in the shape of
modes such as "Dutch roll", which involves both the bank and sideslip angles
of the aircraft [A-71].

In addition to the similar goals of the two problem areas, as we shall
see, people in each area have obtained results by drawing from very similar
bags of mathematical tricks. However, there are differences between the
methods used and results obtained in the two areas. In the analysis of
digital filters the work has been characterized by the study of systems containing
quite specific nonlinearities. In addition, much of the work has dealt with
specific filter structures. In particular, second-order filters have received
a great deal of attention [A-2,3,11,15,18,31] since more complex filters can
be built out of series-parallel interconnections of such sections. Also,
the class of wave digital filters [A-6,7,8,9,10] has been studied in some
detail. Studies in these areas have yielded extremely detailed descriptions
of regions of stability in parameter space (see, for example, [A-3]) and
numerous upper and lower bounds on limit cycle magnitudes (see [A-3,4,20,26,31,
35,56,59,60,63]).

In control theory, on the other hand, the recent trend has been in the
development of rather general theories, concepts, and techniques for
stability analysis. A number of rather powerful mathematical techniques have
been developed, but there has not been as much attention paid to obtaining
tight bounds for specific problems. In addition, problems involving limit
cycles have not received nearly as much attention in recent years as issues
such as bounded-input, bounded-output stability and global asymptotic stability
(although there clearly is a relationship between these issues and limit cycles).

In the rest of this section, we briefly discuss the relationship between
some of the results in the two fields. Our aim here is to point out areas in
which researchers have used similar techniques, obtained similar results,
or relied on similar concepts.

A.1 The Use of Lyapunov Theory

The technique of constructing Lyapunov functions to prove the stability
of dynamical systems has been used by researchers in both fields. The basic
ideas behind Lyapunov theory are the following (see [A-47,48,52,64] for
details and further discussions): consider the dynamical system

$$x(k+1) = f(x(k)), \qquad f(0) = 0 \tag{A.3}$$

(where x is a vector). Suppose we can find a function V(x) such that
V(0)=0 and the first difference along solutions satisfies

$$\Delta V(x) \triangleq V(f(x)) - V(x) \le 0 \tag{A.4}$$

Such a function is called a Lyapunov function. If this function has some
additional properties, we can prove stability or instability of (A.3).
Examples are (see [A-47,48] for proofs):

Theorem A.1: Suppose V is such that

(i) It is positive definite -- i.e., there exists a continuous,
nondecreasing scalar function $\alpha$, such that $\alpha(0)=0$ and

$$V(x) \ge \alpha(|x|) > 0 \quad \text{for } x \neq 0 \tag{A.5}$$

(ii)
$$\alpha(|x|) \to \infty \quad \text{when } |x| \to \infty \tag{A.6}$$

(iii) $\Delta V$ is negative definite -- i.e., there exists a continuous,
nondecreasing scalar function $\gamma$, such that

$$\Delta V(x) \le -\gamma(|x|) < 0 \quad \text{for } x \neq 0 \tag{A.7}$$

Then all solutions of (A.3) converge to 0.

In this result, we can think of V as an "energy" function, and (A.5),
(A.6) essentially state the intuitive idea that the larger the system state,
the more energy that is stored in it. With this interpretation, the theorem
states that if the system dissipates energy (equation (A.7)), the state will
converge to 0. If we allow ourselves to consider "energies" which can take
on negative values, we can get instability results, such as

Theorem A.2: Suppose V satisfies (A.4) and suppose there exists an $x_0$ such
that $V(x_0) < 0$. Then the system is not asymptotically stable in the large,
since the solution starting at $x_0$ does not converge to 0.

The point here is that since energy decreases, once we arrive at a
negative energy state, we can never reach the zero energy state.
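
As a small worked illustration of the two theorems (the scalar systems and the choices of V below are assumptions made for this example, not taken from the references), consider:

```latex
% Theorem A.1: scalar system x(k+1) = a x(k) with |a| < 1, and V(x) = x^2.
% Take \alpha(r) = r^2 and \gamma(r) = (1 - a^2) r^2.
V(0) = 0, \qquad V(x) = x^2 \ge \alpha(|x|) > 0 \ \ (x \neq 0), \qquad
\alpha(|x|) \to \infty \ \text{as} \ |x| \to \infty,
\\
\Delta V(x) = (a x)^2 - x^2 = -(1 - a^2)\, x^2 \le -\gamma(|x|) < 0 \ \ (x \neq 0)
\quad \Longrightarrow \quad x(k) \to 0 .
\\[1ex]
% Theorem A.2: scalar system x(k+1) = 2 x(k), and the "negative energy" V(x) = -x^2.
\Delta V(x) = -(2x)^2 + x^2 = -3 x^2 \le 0, \qquad V(x_0) = -x_0^2 < 0 \ \ (x_0 \neq 0)
\quad \Longrightarrow \quad x(k) = 2^k x_0 \not\to 0 .
```

In the first case the hypotheses of Theorem A.1 are verified without solving the difference equation; in the second, the explicit solution confirms what Theorem A.2 predicts.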

As mentioned earlier, Lyapunov stability has been used by many researchers.
A crucial advantage of Lyapunov-type results is that the hypotheses for
results such as Theorems A.1 and A.2 can be checked using the functions V and f
only -- i.e., one does not have to construct explicit solutions to difference
or differential equations. However, the major problem with the theory is
the difficulty in finding Lyapunov functions in general. For linear systems,
however, a theory exists, and one can always find a quadratic Lyapunov function

$$V(x) = x'Qx \tag{A.8}$$

that will determine if the system is asymptotically stable (in fact a constructive
procedure using the Lyapunov equation [A-47,48] can be used). For nonlinear
systems the construction of Lyapunov functions is much more difficult (see
[A-47,48] for several techniques).
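
For the linear case x(k+1) = A x(k), the first difference of (A.8) is x'(A'QA - Q)x, so one standard construction picks a positive definite W and solves the discrete Lyapunov equation A'QA - Q = -W for Q. The sketch below is one possible way to do this, under stated assumptions: it uses a simple fixed-point iteration (which converges only when A is asymptotically stable) rather than any particular procedure from [A-47,48], and the matrix `A`, the choice W = I, and all function names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def solve_discrete_lyapunov(A, W, tol=1e-12, max_iter=10000):
    """Solve A'QA - Q = -W by iterating Q <- A'QA + W.
    Assumes A is asymptotically stable; otherwise the iteration does not converge."""
    n = len(W)
    At = transpose(A)
    Q = [row[:] for row in W]
    for _ in range(max_iter):
        AtQA = matmul(matmul(At, Q), A)
        Q_next = [[AtQA[i][j] + W[i][j] for j in range(n)] for i in range(n)]
        change = max(abs(Q_next[i][j] - Q[i][j]) for i in range(n) for j in range(n))
        Q = Q_next
        if change < tol:
            return Q
    raise ValueError("iteration did not converge; A may not be asymptotically stable")


def V(Q, x):
    """Quadratic form V(x) = x'Qx of equation (A.8)."""
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))


def is_positive_definite_2x2(Q):
    """Leading-principal-minor test for a symmetric 2x2 matrix."""
    return Q[0][0] > 0 and Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0] > 0


A = [[0.0, 1.0], [-0.9, 0.0]]   # hypothetical stable second-order system
W = [[1.0, 0.0], [0.0, 1.0]]    # any positive definite W would do; identity assumed
Q = solve_discrete_lyapunov(A, W)

print(Q)
print(is_positive_definite_2x2(Q))
```

With this Q, V decreases by exactly x'Wx at every step of the linear system, so V satisfies the hypotheses of Theorem A.1. Note that the same quadratic V is only a candidate, not a guaranteed Lyapunov function, once a quantization nonlinearity is introduced.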

With respect to the limit cycle problem, Willson [A-2,13] has utilized
Lyapunov functions (and essentially Theorem A.1) to determine conditions under
which second order digital filters will not have overflow limit cycles and
will respond to "small" inputs in a manner that is asymptotically close to
the ideal response. Parker and Hess [A-26] and Johnson and Lack [A-59,60]
have used Lyapunov functions to obtain bounds on the magnitude of limit
cycles. In each of these the Lyapunov function used was a quadratic form
which in fact proved asymptotic stability for the ideal linear system.
In Willson's work [A-13], he was able to show that his results were in some
sense tight by constructing counterexamples when his condition was violated.
In [A-26,59,60] the bounds are not as good as others that have been found,
and, as Parker and Hess state, this may be due to the difficulty of determining
which quadratic Lyapunov function to use. As pointed out by Claasen et al.
[A-3], it appears to be difficult to find appropriate Lyapunov functions for
the discontinuous nonlinearities that characterize quantization (see
Appendix 1 for an example of the type of result that one can find).

There is a class of digital filters, wave digital filters (WDF) [A-6,7,
8,9,10], for which one can use Lyapunov techniques to prove stability. Such
filters have been developed by Fettweis so that they possess many of the
properties of classical analog filters. Motivated by these analogies,
Fettweis [A-8] defines the notion of "instantaneous pseudopower", which is a
particular quadratic form in the state of the WDF. By defining the notion of
"pseudopassivity" of such a filter, Fettweis introduces (in a very natural
way for this setting) the notion of dissipativeness. With this framework, the
pseudopower becomes a natural candidate for a Lyapunov function, and in [A-10],
Fettweis and Meerkotter are able to apply standard Lyapunov arguments to obtain
quite reasonable conditions on numerical operations that guarantee the
asymptotic stability of pseudopassive WDF's. The introduction of the concept of
dissipativeness in the study of stability is an often-used idea (see the note
of Desoer [A-36]), and a number of important stability results have as their
basis (at least from some points of view) some notion of passivity. We will
have a bit more to say about this in the next subsection. We note here that
the use of passivity concepts and the tools of Lyapunov theory appear to be
of some value in the development of new digital filter structures that behave
well in the presence of quantization. As an example, we refer the reader to
the recent paper [A-11] in which a new second order filter structure is
developed and analyzed using pseudopower-Lyapunov arguments.

Lyapunov concepts have found numerous applications in control theory.
Detailed studies of their use in system analysis are described in the important
paper of Kalman and Bertram [A-48] and the texts [A-47], [A-52], and [A-64].
As mentioned earlier, the construction of quadratic Lyapunov functions for linear
systems is well understood and is described in detail in these texts. The key
result in this area is the following:

Theorem A.3: Consider the discrete-time system

x(k+1) = Ax(k) (A.9)

This system is asymptotically stable (i.e. all of the eigenvalues of A lie
inside the unit circle in the complex plane) if and only if for any positive
definite matrix L, the solution Q of the (discrete) Lyapunov equation

A'QA - Q = -L (A.10)

is also positive definite. In this case the function

V(x) = x'Qx (A.11)

is a Lyapunov function satisfying the hypotheses of Theorem A.1, i.e. it
proves the asymptotic stability of (A.9).
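As a rough numerical illustration of Theorem A.3 (a sketch constructed here, not
taken from the references), one can solve (A.10) for a small example by summing
the series Q = L + A'LA + (A')^2 L A^2 + ..., which converges when A is
asymptotically stable. The matrices A and L and all helper names below are
hypothetical choices made only for illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def solve_discrete_lyapunov(A, L, terms=500):
    """Approximate the solution Q of A'QA - Q = -L by the truncated series
    Q = sum_{k>=0} (A')^k L A^k (valid only when A is asymptotically stable)."""
    n = len(A)
    At = transpose(A)
    Q = [[0.0] * n for _ in range(n)]
    term = [row[:] for row in L]          # current term (A')^k L A^k
    for _ in range(terms):
        Q = [[Q[i][j] + term[i][j] for j in range(n)] for i in range(n)]
        term = matmul(matmul(At, term), A)
    return Q


def is_positive_definite_2x2(Q):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return Q[0][0] > 0 and Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0] > 0


# Hypothetical example: the eigenvalues of A are 0.5 and 0.8.
A = [[0.5, 1.0], [0.0, 0.8]]
L = [[1.0, 0.0], [0.0, 1.0]]              # the choice L = I
Q = solve_discrete_lyapunov(A, L)

# Residual of the Lyapunov equation: A'QA - Q + L should be numerically zero.
AtQA = matmul(matmul(transpose(A), Q), A)
residual = max(abs(AtQA[i][j] - Q[i][j] + L[i][j])
               for i in range(2) for j in range(2))
print("Q =", Q)
print("residual =", residual)
print("Q positive definite:", is_positive_definite_2x2(Q))
```

The series is used here only because it needs no linear-algebra library; for
larger problems one would normally solve (A.10) as a set of linear equations in
the entries of Q.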

The equation (A.10) and its continuous-time analog (see [A-47]) arise in
several contexts in control theory, and we will mention it again later in a
different setting. Also, note that Theorem A.3 provides a variety of choices
for Lyapunov functions (we can choose any L>0 in (A.10)). Parker and Hess
[A-26] obtain bounds on the magnitude of limit cycles by choosing L=I (here
(A.9) represents the ideal linear model). Tighter bounds might be possible
with other choices of L, but, as they mention, it is not at all clear how one
would go about finding a "better" choice (other than by trial and error). We
also refer the reader to the paper of Kalman and Bertram [A-48] in which they
use Lyapunov techniques to bound the magnitude of solutions of difference
equations perturbed by nonlinearities.

For specific applications of Lyapunov theory to linear and nonlinear
systems, we refer the reader to the references or to the literature (in
particular the IEEE Transactions on Automatic Control). In the remainder of
this subsection we concentrate on another use of Lyapunov concepts: as
intermediate steps in the development of other results in control theory.

An example of this occurs in the analysis of optimal control and estimation
systems [A-64,65,66,67]. Consider the linear system

$$x(k+1) = Ax(k) + Bu(k), \qquad y(k) = Cx(k) \tag{A.12}$$

and suppose we wish to find the control u that minimizes the cost

$$J = \sum_{i=0}^{\infty} \left[ y'(i)y(i) + u'(i)u(i) \right] \tag{A.13}$$


This is a special case of the output regulator problem [A-66]. Here the cost
(A.13) represents a tradeoff between regulation of the output (the y'y term)
and the conservation of control energy (the u'u term). The following is
the solution for a particular case:


Theorem A.4: Suppose the system (A.12) is completely controllable (any
state can be reached from any other state by application of an appropriate
input sequence) and completely observable (the state can be uniquely
determined from knowledge of the input and output sequences). Then the optimal
control in feedback form is

$$u(k) = -(R+B'KB)^{-1} B'KA\, x(k) \tag{A.14}$$

where K is the unique positive definite solution of the algebraic Riccati
equation

$$K = A'KA + C'C - A'KB(R+B'KB)^{-1} B'KA \tag{A.15}$$

Here R denotes the weighting matrix on the control term of the cost; for the
cost (A.13) as written, R = I.
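To make Theorem A.4 concrete, the following sketch (our own illustration, under
stated assumptions) iterates the right-hand side of (A.15) from K = 0 for a
single-input example, forms the gain of (A.14), and checks that the closed loop
is stable. The system matrices, the choice R = 1, and the fixed-point iteration
are assumptions made for illustration; they are not a method prescribed by the
references.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def riccati_rhs(K, A, B, C, r):
    """Right-hand side of (A.15) for a single-input system (R = r is a scalar)."""
    At, Bt = transpose(A), transpose(B)
    AtKA = matmul(matmul(At, K), A)
    AtKB = matmul(matmul(At, K), B)                 # n x 1
    BtKA = matmul(matmul(Bt, K), A)                 # 1 x n
    s = r + matmul(matmul(Bt, K), B)[0][0]          # scalar R + B'KB
    CtC = matmul(transpose(C), C)
    n = len(A)
    return [[AtKA[i][j] + CtC[i][j] - AtKB[i][0] * BtKA[0][j] / s
             for j in range(n)] for i in range(n)]


# Hypothetical example: a discrete-time double integrator with position output.
A = [[1.0, 1.0], [0.0, 1.0]]
B = [[0.5], [1.0]]
C = [[1.0, 0.0]]
r = 1.0                                             # R = I, matching the u'u term

# Fixed-point iteration K <- rhs(K), started from K = 0.
K = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(500):
    K = riccati_rhs(K, A, B, C, r)

K_next = riccati_rhs(K, A, B, C, r)
residual = max(abs(K_next[i][j] - K[i][j]) for i in range(2) for j in range(2))

# Feedback gain of (A.14): u(k) = -F x(k), with F = (R + B'KB)^{-1} B'KA.
s = r + matmul(matmul(transpose(B), K), B)[0][0]
F = [[v / s for v in matmul(matmul(transpose(B), K), A)[0]]]
BF = matmul(B, F)
A_cl = [[A[i][j] - BF[i][j] for j in range(2)] for i in range(2)]

# For a 2x2 matrix, both eigenvalues lie inside the unit circle if and only if
# |det| < 1 and |trace| < 1 + det (the Jury conditions).
tr = A_cl[0][0] + A_cl[1][1]
det = A_cl[0][0] * A_cl[1][1] - A_cl[0][1] * A_cl[1][0]
closed_loop_stable = abs(det) < 1 and abs(tr) < 1 + det

print("K =", K)
print("gain F =", F)
print("Riccati residual =", residual)
print("closed loop stable:", closed_loop_stable)
```

The open-loop system in this example is not asymptotically stable (both
eigenvalues of A equal 1), so the stability of the closed loop is due entirely
to the feedback (A.14).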
One proof of this result proceeds along the following lines. Suppose
we are presently in the state x. We can then define the optimal cost-to-go,
V(x), as the minimum of J in (A.13) when we start in state x. With the aid of
dynamic programming methods [A-66], one can show that V has the form

V(x) = x'Kx (A.16)

where K satisfies (A.15). The finiteness of V is proven using controllability,
while observability guarantees that if x≠0, then y and u cannot both be
identically zero and thus J>0. As a final important question, consider the
closed loop system (A.12), (A.14). As discussed in [A-66] one can show that
this system is asymptotically stable, and, in fact, the cost-to-go function
V(x) is a Lyapunov function which proves this result. Observability and
controllability (and their somewhat weaker counterparts, detectability and
stabilizability) are important concepts in the development of this result and
many others. In fact, the concept of observability allows one to prove [A-51]:

Theorem A.5: Consider the system (A.9) and the function V(x) = x'Qx.
Suppose

(i) Q>0

(ii) V(Ax)-V(x) = x'[A'QA-Q]x = -x'C'Cx

(iii) The system (A.9) is observable from the output
y(k) = Cx(k)

Then (A.9) is asymptotically stable.

Comparing Theorems A.3 and A.5, we see that we have replaced the
negative definiteness of A'QA-Q with negative semidefiniteness and an
observability condition. The intuitive idea is the following: negative
definiteness makes it clear that V(x(k)) strictly decreases along solutions
whenever x(k)≠0, and from this we can deduce asymptotic stability; negative
semidefiniteness only says V does not increase. However, is it possible that V
can remain stationary indefinitely at a non-zero value? The answer is no, since
if it did, we would be able to conclude that Cx(j)=0, j=k, k+1, k+2,..., and
observability would require x(k)=0. Thus V must decrease (not necessarily at
every single step), and we can again deduce asymptotic stability.
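A small worked example (constructed here for illustration, not taken from
[A-51]) shows why the decrease need not occur at every step. The matrices A, C,
and Q and the initial state below are hypothetical choices.

```latex
A = \begin{pmatrix} 0 & 1 \\ 1/2 & 0 \end{pmatrix}, \qquad
C = \begin{pmatrix} 1 & 0 \end{pmatrix}, \qquad
Q = \tfrac{4}{3} I,
\qquad
A'QA - Q = \tfrac{4}{3}\begin{pmatrix} 1/4 - 1 & 0 \\ 0 & 1 - 1 \end{pmatrix}
         = -\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = -C'C .

x(0) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}: \; V(x(0)) = \tfrac{4}{3}, \qquad
x(1) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}: \; V(x(1)) = \tfrac{4}{3}, \qquad
x(2) = \begin{pmatrix} 0 \\ 1/2 \end{pmatrix}: \; V(x(2)) = \tfrac{1}{3}.
```

The pair (A,C) is observable, since C = (1 0) and CA = (0 1) together form the
identity matrix. V is stationary over the first step because Cx(0) = 0, but it
drops by (Cx(1))^2 = 1 over the second, as the argument above requires.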

Thus, we see that Lyapunov concepts, when combined with ideas from the
theory of state-space models, can lead to important results concerning optimal
designs of controllers and estimators. See [A-64,65,66,67] for continuous
time analogs of these results and dual results for estimators (the reader is
also advised to examine [A-68] in which the interplay of many of these ideas
is discussed).

In addition to its use in studying design methods such as the regulator
problem, Lyapunov theory has been used as a framework for the development of
many more explicit stability criteria (recall that Lyapunov theory in principle
requires a search for an appropriate function). Examples of these are a number
of the frequency domain stability criteria that have been developed in the last
10 to 15 years (see [A-1,21,22,23,24,33,37,38,39,43,44,45]). Several of these
results have analogs for the limit cycle problem. For example, Tsypkin's
criteria [A-33,21,2] and [A-44, p.194], which are analogs of the circle and
Popov criteria in continuous time (see [A-43,44]), have counterparts in the
theory of limit cycles [A-15,16]. We note also that instability counterparts
of the Tsypkin-Popov type of result have been developed from a Lyapunov point
of view [A-1,39], and a thorough understanding of the basis for these results
may lead to analogous results for limit cycles in digital filters.

We defer further discussion of these results to the next subsection, in 
which we are interested in examining the interplay among a number of stability 
concepts (passivity, Lyapunov, Tsypkin, frequency domain analysis, positive 
real functions, etc.). The key point is that many stability results can and 
have been derived in a number of different ways, and an examination of these 
various derivations reveals an interrelationship between the various methods 
of stability analysis. Some of the most fundamental work that has been done 
in this area has been accomplished by J.C. Willems [A-49,50,69] , and the reader 
is referred to his work for a more thorough treatment of these issues and for 
further references. 

A.2 Frequency Domain Criteria, Passivity, and Lyapunov Functions

We have already mentioned that the notion of passivity is of importance 
in stability theory and have seen that Fettweis and Meerkotter have been able 
to utilize passivity notions to study certain digital filters via Lyapunov 
techniques. The relationship between passivity, Lyapunov functions, and 
many of the frequency domain criteria of stability theory is quite deep, and 
in this subsection we wish to illustrate some of these ideas. The interested 
reader is referred to the references for more details. 

In recent years the concept of passivity has become one of the fundamental
notions in the study of feedback stability. This notion, which is very much an
input/output concept, is developed in detail by J.C. Willems [A-41,42,50,69].
We follow [A-42,69] (footnote 3). Let U and Y be input and output sets,
respectively, and let $\mathcal{U}_e$ and $\mathcal{Y}_e$ be sets of functions
from a time set $\mathcal{T}$ into U and Y ($\mathcal{T}$ may be continuous or
discrete, as discussed in [A-69]). Let G: $\mathcal{U}_e \to \mathcal{Y}_e$ be
a dynamic system, mapping input functions $u \in \mathcal{U}_e$ into output
functions $Gu \in \mathcal{Y}_e$ (we assume that G is a causal map [A-69]).
Intuitively, stability means that small inputs lead to small outputs, and the
following makes this precise.

Definition A.1: Let $\mathcal{U}$, $\mathcal{Y}$ be subspaces of
$\mathcal{U}_e$ and $\mathcal{Y}_e$, respectively (these are our "small
signals"). The system G is I/O stable if $u \in \mathcal{U}$ implies
$Gu \in \mathcal{Y}$. Furthermore, if $\mathcal{U}$, $\mathcal{Y}$ are normed
spaces, then G is finite gain I/O stable if there exists $K < \infty$ such that

$$\|Gu\| \le K \|u\| \tag{A.17}$$


A typical example is the case $\mathcal{T}$ = positive integers,
$\mathcal{U}_e = \mathcal{Y}_e$ = all real sequences of numbers,
$\mathcal{U} = \mathcal{Y}$ = all square-summable sequences. In this case I/O
stability means (y=Gu)

$$\sum_{i=1}^{\infty} u_i^2 < \infty \;\Longrightarrow\; \sum_{i=1}^{\infty} y_i^2 < \infty \tag{A.18}$$

and finite-gain I/O stability means

$$\left( \sum_{i=1}^{\infty} y_i^2 \right)^{1/2} \le K \left( \sum_{i=1}^{\infty} u_i^2 \right)^{1/2} \tag{A.19}$$

Footnote 3: Our development is by no means complete, as our intention is to
relate several ideas and not to prove theorems. Thus, the reader is referred to
the references (in particular to [A-42]) for a thorough treatment and for
precise statements of the results described here (for example, we have not
included a discussion of system well-posedness, which bears some similarities
to the constraints on feedback paths imposed by Fettweis in his development of
wave digital filters).


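Note one property of this example. Let $P_T$ be the operator

$$(P_T x)(t) = \begin{cases} x(t), & t \le T \\ 0, & t > T \end{cases} \tag{A.20}$$

Then for any $u \in \mathcal{U}_e$, $y \in \mathcal{Y}_e$, we have
$P_T u \in \mathcal{U}$, $P_T y \in \mathcal{Y}$. In this case $\mathcal{U}_e$,
$\mathcal{Y}_e$ are called (causal) extensions of $\mathcal{U}$, $\mathcal{Y}$,
and we assume this to be the case from now on.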

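We now can define passive systems.

Definition A.2: Let $\mathcal{U} = \mathcal{Y}$, and assume that $\mathcal{U}$
is an inner product space. Then G is passive if

$$\langle P_T u, P_T Gu \rangle \ge 0 \qquad \forall u \in \mathcal{U}_e,\; T \in \mathcal{T} \tag{A.21}$$

and strictly passive if there is an $\epsilon > 0$ such that

$$\langle P_T u, P_T Gu \rangle \ge \epsilon \|P_T u\|^2 \qquad \forall u \in \mathcal{U}_e,\; T \in \mathcal{T} \tag{A.22}$$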


In terms of our example, G is passive if and only if (y=Gu)

$$\sum_{i=1}^{N} u_i y_i \ge 0 \qquad \forall u_i,\, N \tag{A.23}$$

and is strictly passive if and only if there exists an $\epsilon > 0$ such that

$$\sum_{i=1}^{N} u_i y_i \ge \epsilon \sum_{i=1}^{N} u_i^2 \qquad \forall u_i,\, N \tag{A.24}$$
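The conditions (A.23) and (A.24) are easy to examine numerically for a given
input. The sketch below does so for a hypothetical system
y_i = u_i + 0.5 u_{i-1} (our choice for illustration), for which the
Cauchy-Schwarz inequality gives strict passivity with ε = 0.5. A check on one
input sequence can of course only refute passivity, not prove it.

```python
import random


def system_output(u):
    """Hypothetical example system y_i = u_i + 0.5*u_{i-1}, with u_0 = 0."""
    y = []
    previous = 0.0
    for value in u:
        y.append(value + 0.5 * previous)
        previous = value
    return y


def passivity_margin(u, y, eps):
    """Smallest value over N of sum_{i<=N} u_i*y_i - eps * sum_{i<=N} u_i^2."""
    inner = 0.0
    energy = 0.0
    worst = float("inf")
    for ui, yi in zip(u, y):
        inner += ui * yi
        energy += ui * ui
        worst = min(worst, inner - eps * energy)
    return worst


random.seed(0)
u = [random.uniform(-1.0, 1.0) for _ in range(1000)]
y = system_output(u)

# (A.23): every truncated inner product is nonnegative.
print("passive on this input:", passivity_margin(u, y, 0.0) >= 0.0)
# (A.24) with eps = 0.5 (a small tolerance absorbs rounding error).
print("strictly passive (eps = 0.5) on this input:",
      passivity_margin(u, y, 0.5) >= -1e-12)
# eps = 1.6 is too large for this system, so the margin becomes negative.
print("margin with eps = 1.6:", passivity_margin(u, y, 1.6))
```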


Much as with Fettweis's pseudopassive blocks, passive systems can be
interconnected in feedback arrangements and remain passive. The following
result is of this type, and it, in fact, is one of the cornerstones of
feedback stability theory [A-69].

Theorem A.6: Consider the feedback system of Figure A.2 with all inputs and
outputs elements of the same space $\mathcal{U}_e$ (for simplicity). The
feedback system is strictly passive and finite gain I/O stable if

(i) $G_1$ is strictly passive and finite-gain input/output stable

(ii) $G_2$ is passive

As outlined by J.C. Willems in [A-69], there are three basic stability
principles: the one above, the small loop gain theorem (stability arises
if the gains of $G_1$ and $G_2$ are each less than unity, a result used in the
digital filter context in [A-72]), and the next result, which depends upon
the following

Definition A.3: Same conditions on $\mathcal{U}$, $\mathcal{Y}$ as in
Definition A.2. Let $a \le b$ be given real numbers. Then G is inside (outside)
the sector [a,b] if

$$\langle (G-aI)u, (G-bI)u \rangle \le 0 \;\; (\ge 0) \qquad \forall u \in \mathcal{U} \tag{A.25}$$

It is strictly inside (outside) the sector [a,b] if there exists an
$\epsilon > 0$ such that

$$\langle (G-aI)u, (G-bI)u \rangle \le -\epsilon \|u\|^2 \;\; (\ge \epsilon \|u\|^2) \qquad \forall u \in \mathcal{U} \tag{A.26}$$

We now state a variation of Willems' third stability condition (see
[A-42]).


Theorem A.7: Consider the feedback system of Figure A.2. This system is
finite gain stable if $G_2$ is Lipschitz continuous, i.e.

$$\|G_2 u_1 - G_2 u_2\| \le K \|u_1 - u_2\|$$

and if for some $a \le b$, $b > 0$, $G_2$ is strictly inside the sector [a,b],
$I + \frac{1}{2}(a+b)G_1$ has a causal inverse on $\mathcal{U}_e$ (not
necessarily on $\mathcal{U}$), and $G_1$ satisfies:

(i) $a < 0 \Rightarrow G_1$ is inside the sector $[-\frac{1}{b}, -\frac{1}{a}]$ on $\mathcal{U}$

(ii) $a > 0 \Rightarrow G_1$ is outside the sector $[-\frac{1}{a}, -\frac{1}{b}]$ on $\mathcal{U}$

(iii) $a = 0 \Rightarrow G_1 + \frac{1}{b} I$ is passive on $\mathcal{U}$.

As developed by J.C. Willems [A-42,69], this result leads to the circle
criterion (in continuous time). Let us examine the third case in the Theorem
in order to sketch the derivation of one of Tsypkin's criteria. Consider the
system in Figure A.3. Here f is a memoryless nonlinearity, and we assume
that f is in the sector [0,k]. We also take $\mathcal{U}$ = all square
summable sequences. The system $G_1$ is a linear time-invariant system
characterized by the transfer function G(z), which we assume to be stable.
Condition (iii) of Theorem A.7 then says that $G_1 + \frac{1}{k} I$ must be
passive on $\mathcal{U}$, and, as developed in [A-42,69], this will be the case
if and only if $G(z) + \frac{1}{k}$ is positive real:

$$\mathrm{Re}\left( G(e^{j\omega}) \right) + \frac{1}{k} \ge 0 \qquad \forall \omega \in [0, 2\pi) \tag{A.27}$$

which is precisely Tsypkin's condition [A-33]. The fact that
$I + \frac{k}{2} G_1$ is invertible can be obtained by analogy with the
continuous time results in [A-42, Chapter 5] (in fact, this result is a simple
consequence of the Nyquist criterion when we observe that G is stable and take
(A.27) into account).
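As a simple worked instance of (A.27) (a hypothetical first-order example of our
own, not drawn from the references), take G(z) = 1/(z-a) with 0 < a < 1:

```latex
\mathrm{Re}\, G(e^{j\omega})
  = \mathrm{Re}\, \frac{1}{e^{j\omega} - a}
  = \frac{\cos\omega - a}{1 - 2a\cos\omega + a^2},
\qquad
\frac{d}{dc}\left( \frac{c - a}{1 - 2ac + a^2} \right)
  = \frac{1 - a^2}{(1 - 2ac + a^2)^2} > 0 ,

\min_{\omega} \mathrm{Re}\, G(e^{j\omega})
  = \left. \frac{\cos\omega - a}{1 - 2a\cos\omega + a^2} \right|_{\omega = \pi}
  = -\frac{1}{1 + a}
\quad\Longrightarrow\quad
\text{(A.27) holds if and only if } k \le 1 + a .
```

The real part is increasing in c = cos ω, so the binding frequency is ω = π,
and the test admits any nonlinearity in the sector [0,k] with k no larger than
1 + a.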

Consider the feedback system in Figure A.2. It is clear that the input-output
behavior of this system is the same as that for the system in Figure A.4,
where M and N are operators (not necessarily causal). As discussed in [A-42,44],
one can often find appropriate multipliers so that the modified forward and
feedback systems satisfy the criteria of Theorem A.7. This is in fact the basis
for Popov's criterion [A-37], for its generalizations [A-38,39,40,42,43,44,45],
and for Tsypkin's discrete-time version [A-23,44].

Consider a nonlinear feedback system as in Figure A.3 but in continuous
time (i.e. replace G(z) with G(s)), and again suppose f is strictly inside the
sector [0,k]. Using the multipliers

$$N = I, \qquad M = \frac{1}{1+\alpha s}$$

we can show that the feedback path is also strictly inside the sector [0,k]
and hence the modified forward loop must satisfy a passivity condition.
Specifically, we obtain Popov's condition (see [A-38]) that the feedback
system is finite gain I/O stable if G is stable (all poles in the left-hand
plane) and if $(1+\alpha s)G(s) + \frac{1}{k}$ is positive real for some
$\alpha \ge 0$, i.e. if

$$\mathrm{Re}\left[ (1+\alpha j\omega) G(j\omega) \right] + \frac{1}{k} \ge 0 \qquad \forall \omega$$

To obtain Tsypkin's result [A-23,43], we must in addition assume that f is
nondecreasing. In this case, the discrete-time system is finite gain I/O
stable if there exists $\alpha \ge 0$ such that

$$\mathrm{Re}\left[ \left(1+\alpha(1-e^{-j\omega})\right) G(e^{j\omega}) \right] + \frac{1}{k} \ge 0 \qquad \forall \omega \in [0, 2\pi) \tag{A.28}$$

As mentioned earlier, a number of extensions of Popov's criterion in
continuous time are available, and we refer the reader to [A-42,44,45] and
in particular to [A-38]. As we shall see, some of the results on digital
filter limit cycles resemble Tsypkin-type criteria.

Sector nonlinearity characteristics play a major role in the study of
digital filter limit cycles (see in particular [A-15]). Specifically,
consider the roundoff quantizer in Figure A.5. This function is inside the
sector [0,2] (see [A-3,15] for other quantizers and their sector
characteristics). Using simply the sector nature of a nonlinearity, Claasen,
et al. [A-15] prove the following

Theorem A.8: Consider the feedback system of Figure A.3, where f is in the
sector [0,k]. Then limit cycles of period N are absent if

$$\mathrm{Re}\left( G(e^{j2\pi \ell/N}) \right) + \frac{1}{k} > 0 \tag{A.29}$$

for $\ell = 0, 1, \ldots, N-1$.
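The two ingredients above can be checked numerically. The sketch below (our own
illustration) first verifies on a grid that a roundoff quantizer with unit step
lies in the sector [0,2] but not in [0,1.9], and then evaluates the test (A.29)
with k = 2 for several periods N. The transfer function G(z) = 0.8/(z - 0.5)
and all function names are hypothetical; a failed test means only that Theorem
A.8 is inconclusive for that period, not that a limit cycle exists.

```python
import cmath
import math


def roundoff(x, q=1.0):
    """Roundoff quantizer with step q (ties rounded upward)."""
    return q * math.floor(x / q + 0.5)


def inside_sector(f, xs, a, b):
    """Memoryless form of (A.25): (f(x) - a*x) * (f(x) - b*x) <= 0 on the grid."""
    return all((f(x) - a * x) * (f(x) - b * x) <= 0.0 for x in xs)


def G(z):
    """Hypothetical stable transfer function used only for illustration."""
    return 0.8 / (z - 0.5)


def theorem_a8_excludes(N, k, transfer):
    """Check (A.29) at the N frequencies 2*pi*l/N, l = 0, ..., N-1."""
    return all(
        transfer(cmath.exp(2j * math.pi * l / N)).real + 1.0 / k > 0.0
        for l in range(N)
    )


xs = [i / 100.0 for i in range(-500, 501)]
print("roundoff inside [0, 2]:  ", inside_sector(roundoff, xs, 0.0, 2.0))
print("roundoff inside [0, 1.9]:", inside_sector(roundoff, xs, 0.0, 1.9))

k = 2.0                                   # sector bound of the roundoff quantizer
for N in range(1, 6):
    print("period", N, "excluded by (A.29):", theorem_a8_excludes(N, k, G))
```

Because (A.29) is evaluated only at the N-th roots of unity, a period can be
excluded even when the condition fails at other frequencies, which is the main
difference from the stability condition (A.27).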


If one also takes the nondecreasing nature of f into account, one obtains
[A-15]:

Theorem A.9: If f is inside the sector [0,k] and also is nondecreasing,
then limit cycles of period N are absent from the system of Figure A.3 if
there exist $\alpha_p \ge 0$ such that

$$\mathrm{Re}\left\{ \left[ 1 + \sum_{p=1}^{N-1} \alpha_p \left( 1 - e^{j2\pi \ell p/N} \right) \right] G(e^{j2\pi \ell/N}) \right\} + \frac{1}{k} > 0 \tag{A.30}$$

for $\ell = 0, 1, \ldots, N-1$.


If we take $\alpha_{N-1}$ to be the only nonzero $\alpha_p$, we obtain the
condition derived by Barkin [A-16], which is quite similar to Tsypkin's
criterion (A.28). Note also the relationship between (A.29) and (A.27). The
proofs given in [A-15] rely heavily on the passivity relations (A.29), (A.30).
Theorem A.8 then follows from an application of Parseval's theorem in order to
contradict the existence of a limit cycle of period N. This last step involves
the assumed periodicity in a crucial way, but the application of Parseval and
the use of the positive real relationship (A.29) is very reminiscent of
stability arguments in feedback control theory [A-42]. In the proof of
Theorem A.9, the monotonicity of f is used in conjunction with a version of
the rearrangement inequality [A-40,42].

Theorem A.10: Let $\{x_n\}$ and $\{y_n\}$ be two sequences of real numbers
that are similarly ordered, i.e.

$$x_n \le x_m \;\Longrightarrow\; y_n \le y_m \tag{A.31}$$

Then if $\pi$ is any permutation

$$\sum_n x_n y_n \ge \sum_n x_{\pi(n)} y_n \tag{A.32}$$

Corollary [A-40]: If f is a monotone function, then for any sequence
$\{x_n\}$ and any permutation $\pi$

$$\sum_n f(x_n) \left[ x_n - x_{\pi(n)} \right] \ge 0 \tag{A.33}$$
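The corollary (A.33) is easy to confirm by brute force for a short sequence.
The sketch below (an illustration of ours) evaluates the sum for every
permutation of a six-element sequence, taking f to be a nondecreasing roundoff
quantizer; the sequence and the function names are arbitrary, hypothetical
choices.

```python
import itertools
import math


def corollary_sum(f, x, perm):
    """Left-hand side of (A.33): sum_n f(x_n) * (x_n - x_{perm(n)})."""
    return sum(f(x[n]) * (x[n] - x[perm[n]]) for n in range(len(x)))


def quantize(v):
    """A nondecreasing (monotone) nonlinearity: roundoff to the nearest integer."""
    return math.floor(v + 0.5)


x = [0.3, -1.7, 2.4, 0.9, -0.6, 1.5]      # arbitrary illustrative sequence
smallest = min(corollary_sum(quantize, x, perm)
               for perm in itertools.permutations(range(len(x))))
print("smallest value of the sum over all permutations:", smallest)
```

The minimum over all 720 permutations should be zero up to rounding error,
since the identity permutation makes every term vanish and no permutation can
make the sum negative.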
We note that Theorem A.9 bears some resemblance to the multiplier-type results
of Popov and Tsypkin. In addition, Willems and Brockett [A-40,42] utilize the
rearrangement inequality to obtain a general multiplier stability result for
discrete-time systems with single monotone nonlinearities. A thorough
understanding of the relationships among these results would be extremely
useful, as it might lead to new results on nonexistence of limit cycles. In
addition, Claasen, et al. [A-15] have developed a further improvement over
(A.30) if f is in addition antisymmetric (f(-x) = -f(x)), and have devised
linear programming techniques to search for the coefficients in (A.30). This
algorithmic concept may prove to be of use in developing search techniques for
other, more complex multipliers. Also, Cook [A-70] has recently reported several
criteria for the absence of limit cycles in continuous time systems. His results
bear a strong relationship to those of Claasen, et al. [A-15]. In particular,
passivity conditions and Parseval's theorem are used in very similar ways in
the two papers.

We now turn our attention to the relationship between input/output concepts
and questions of internal stability (i.e. the response to initial conditions).
Intuitively, if we have an internal, state space representation of a system with
specific input/output behavior G (with G(0)=0), we clearly cannot deduce
asymptotic stability from input/output stability without some conditions on the
state space realization. For example, the map G=0 is input/output stable but
the realizations

$$\dot{x}(t) = x(t), \qquad y(t) = x(t) \tag{A.34}$$

and

$$\dot{x}(t) = x(t) + u(t), \qquad y(t) = 0$$





are clearly not asymptotically stable. In the first case the state space has
an unstable mode, but if we start at x(0)=0 (as we would to realize G), we can
never excite this mode. Hence, I/O stability can tell us nothing about it.
In the second case, we can excite the mode but we cannot observe it. These are
precisely the difficulties that can arise; however, if one imposes certain
controllability and observability conditions on the realization, one can deduce
asymptotic stability from I/O stability. Thus, controllability and observability
play a crucial role in translating from I/O results to Lyapunov-type stability
results. For a precise statement of the relationship between the two, see
[A-49,69].

Having established the above relationship, it is natural to discuss the
generation of Lyapunov functions (which deal with internal stability) for
systems satisfying some type of passivity condition. Some of the most important
work in this area is that of J.C. Willems [A-49,50,69]. In [A-49,69], Willems
discusses the generation of Lyapunov functions for I/O stable systems. For
passive systems he defines the notions of available and required energy as the
solutions of certain variational problems. If one then has a state space
realization satisfying certain controllability and observability conditions,
one can use these functions as Lyapunov functions. This very general,
physically motivated theory is further developed in [A-50]. Dissipative systems
and the associated notions of storage function (an internal variable) and
supply rate (an input/output quantity) are defined, and, much as with Fettweis'
pseudopassivity, dissipative systems have many appealing properties (such as
preservation under interconnections). We refer the reader to [A-50,69] for
details of topics such as the construction of storage functions and their use
as Lyapunov functions.

As mentioned at the end of the preceding subsection, many frequency domain
results can be derived with Lyapunov-type arguments. We have also seen in this
subsection that many of these results can be derived via passivity arguments.
Clearly the two are related, and the crucial result that leads to this
relationship is the Kalman-Yacubovich-Popov lemma [A-61,62,69], which relates
the positive realness of certain transfer functions to the existence of
solutions to particular matrix equalities and inequalities. Kalman [A-62]
utilized this result to obtain a Lyapunov-type proof of the Popov criterion,
and Szego [A-61] (see also the discussion at the end of [A-33]) used a
discrete-time version to obtain a Lyapunov-theoretic proof of Tsypkin's
criterion plus several extensions when the derivative of the nonlinearity is
bounded. In addition, several other researchers [A-1,38,39] have utilized
similar ideas to relate positive real functions to the existence of certain
Lyapunov functions. It is beyond the scope of this paper to discuss this
problem in depth, but we refer the reader to the references, since this area of
research provides a number of insights into the relationships among various
stability concepts. In addition, these results provide examples of nonlinear
problems for which there exist constructive procedures for Lyapunov functions.
We also note that the positive real lemma plays a crucial role in several other
problem areas including the stochastic realization and spectral factorization
problem [B-21] and the study of algebraic Riccati equations [A-67].

Finally, we note that many of these passivity-Lyapunov results have
instability counterparts (e.g., see [A-1,39]). We refer the reader to the
detailed development in [A-39] in which a Lyapunov-theoretic methodology for
generating instability results is described. Such results may be useful in
developing sufficient conditions for the existence of non-zero, undriven
solutions such as limit cycles.

In this section we have considered some of the aspects of stability theory
that we feel deserve the attention of researchers in both disciplines. We have
not, of course, been able to consider all of the possible topics that one might
investigate. For example, the "jump phenomenon", in which small changes in
input lead to large changes in output, is of interest in digital filter theory
and also has been considered in feedback control theory [A-42,43], where
the concept of feedback system continuity is studied. In addition, Claasen,
et al. [A-31] have introduced the concept of accessible limit cycles, and its
relationship to concepts of controllability and also to the structure of the
state transition function of the filter raises intriguing questions. We also
have not discussed the use of describing functions in digital filter analysis.
There have been several attempts in this area (see [A-5,29]), but none of these
has proven to be too successful (see comments in [A-30]). Except for the work of
Parker and Hess [A-26] and Kalman and Bertram [A-48], we have not spoken about
bounds on the magnitudes of responses. In the digital filtering area there
exist a number of results [A-31,35,56], the latter two of which use an idea of
Bertram's [A-58] as a starting point. In control theory, the notion of I/O
gain [A-42,44] is directly tied to response magnitude bounds, although it is not
clear how tight these would be in any particular case. Finally, in this section,
we have not discussed stability criteria for systems with multiple
nonlinearities. There do exist some results in this area for digital filters
(see [A-3,15]), and on the other side, the general framework allows one to
adapt results such as Theorem A.7 to the multivariable case with little
difficulty (hence one can readily obtain matrix versions of Tsypkin's criterion
involving positive real matrices). Also, the techniques of Lyapunov theory
should be of some use in obtaining stability results much like those in [A-2]
for filters of higher order than the second order section.

As we have seen, many of the results in the two disciplines involve the
use of very similar mathematical tools. On the other hand, the perspectives
and goals of researchers in the two fields are somewhat different. The
development of a mutual understanding of these perspectives and goals can only
benefit researchers in both fields and is in fact absolutely crucial for the
successful study of certain problems. For example, in the implementation of
digital control systems one must come to grips with problems introduced by
quantization. Digital controller limit cycles at frequencies near the
resonances of the plant being controlled can lead to serious problems. In
addition, the use of a digital filter in a feedback control loop creates new
quantization analysis problems. Recall that limit cycles can occur only in
recursive (infinite impulse response) filters, while they do not occur in
nonrecursive (finite impulse response) filters. However, if a nonrecursive
filter is used in a feedback control system, quantization errors it produces
can lead to limit cycles of the closed-loop system [A-72]. How can one analyze
this situation, and how does one take quantization effects into account in
digital control system design? Questions such as these await further
investigation.

B. Parameter Identification, Linear Prediction, Least Squares, and
Kalman Filtering

A problem of great importance in many disciplines is the determination
of the parameters of a model given observations of the physical process being
modeled. In control theory this problem is often called the system
identification problem, and it arises in many contexts. The reader is referred
to the special issue of the IEEE Transactions on Automatic Control [B-15] and
to the survey paper of Åström and Eykhoff [B-16] for detailed discussions and
numerous references in this problem area. One of the most important
applications of identification methods is adaptive estimation and control.
Consider the situation depicted in Figure B.1. Here we have a physical process
that is to be controlled or whose state is to be estimated. Many of the most
widely used estimation and control techniques are based on a dynamic model
(transfer function, state space description, etc.) for the system under
consideration. Hence it is necessary to obtain an appropriate model in order to
apply these techniques. Often, one can perform tests on the process before
designing the system and can apply an identification procedure to determine the
system. On the other hand, there are many occasions in which the values of
certain system parameters cannot be determined a priori or are known to vary
during system operation. In such cases, one may often design a controller or
estimator which depends explicitly on these parameters. In this manner we can
adjust the parameters on-line as we perform real time parameter identification.
A number of methods of this type exist, and, in addition to the two survey
references [B-15,16], we refer the reader to [B-80,81,98] for other examples.
The last of these, [B-98], is of interest, as it consists of a variety of
adaptive control techniques all applied to the control of the F-8C aircraft,
and thus provides some insight into the similarities, differences, advantages,
and disadvantages of the various techniques.

Figure B.1: Conceptual Diagram of an Adaptive Estimator-Controller
Utilizing On-Line Parameter Identification

A little thought about the identification problem makes it clear that
there are several issues. Before one can apply parameter identification
schemes, one must have a parametric model, and the determination of the
appropriate structure for such a model is a complex question in itself. We will
not consider this issue in much detail in this paper, and we refer the reader
to the references for details (see several of the papers in [B-15] on canonical
forms and identifiability; also see the work of Rissanen and Ljung [B-79]).

Parameter identification problems also arise in several digital signal
processing applications. Several examples of such problems are given in the
special issue of the Proceedings of the IEEE [B-99], and these include (see
[B-26]) seismic signal processing and the analysis, coding, and synthesis of
speech. This latter application has received a great deal of attention in the
past few years [B-24-26,28-30,44-55,69-71,74], and we will use this problem as
a basis for our discussion of the identification question. We follow the work
of Atal [B-48], Atal and Schroeder [B-70], Markel and Gray [B-44], and Makhoul
[B-26]. Our presentation is necessarily brief and intuitive, and the reader is
referred to these references for details.

1 All of these projects were sponsored by NASA Langley. This "fly-by-wire" 
adaptive control program is still in its evolutionary stages, and new methods 
and concepts are still being developed. 

As discussed in [B-44], a popular and widely accepted model for a discretized 
speech signal {y(k)} is as the output of a linear system which, over short 
enough intervals of time, can be considered to be time-invariant: 

$$ Y(z) = G(z)\,U(z) \tag{B.1} $$

where G represents the overall transfer function and U(z) is the z-transform 
of the input, which is often taken as a periodic pulse train (whose period is 
the pitch period) for voiced sounds and as white noise for unvoiced sounds. 

In addition, a common assumption is that G is an all-pole filter 

$$ G(z) = \frac{1}{1 + \sum_{k=1}^{p} a_k z^{-k}} \tag{B.2} $$


This assumption has been justified in the literature under most conditions, 
although strong nasal sounds require zeroes [B-44]. Note that under condition 
(B.2), equation (B.1) represents an autoregressive (AR) process 

$$ y(k) + a_1 y(k-1) + \dots + a_p y(k-p) = u(k) \tag{B.3} $$


The problem now is to determine the coefficients $a_1,\dots,a_p$. Having 
these coefficients, one is in a position to solve a number of speech analysis 
and communication problems. For example, one can use the model (B.2) to 
estimate formant frequencies and bandwidths, where the formants are the 
resonances of the vocal tract [B-55]. In addition, one can use the model 
(B.3) for efficient coding, transmission, and synthesis of speech [B-70]. 
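
As a small illustration of the formant remark, the sketch below treats the 
simplest case, p = 2, where G(z) in (B.2) has a single complex pole pair. It 
uses the standard conversion from a pole at radius r and angle theta to a 
resonance frequency fs*theta/(2*pi) and bandwidth -fs*ln(r)/pi. The function 
name, the sampling rate fs, and the numerical values are assumptions made 
only for this example; they are not taken from the references. 

```python
import cmath
import math


def formant_from_pole_pair(a1, a2, fs):
    """Frequency and bandwidth (Hz) of the pole pair of
    G(z) = 1 / (1 + a1 z^-1 + a2 z^-2), assuming complex-conjugate poles."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    pole = (-a1 + disc) / 2.0          # root of z^2 + a1 z + a2 in the upper half plane
    radius, angle = abs(pole), cmath.phase(pole)
    freq = fs * angle / (2.0 * math.pi)
    bandwidth = -fs * math.log(radius) / math.pi
    return freq, bandwidth


# Hypothetical resonance: 500 Hz at an 8 kHz sampling rate, pole radius 0.95.
fs = 8000.0
r0, theta0 = 0.95, 2.0 * math.pi * 500.0 / fs
a1, a2 = -2.0 * r0 * math.cos(theta0), r0 * r0
freq, bw = formant_from_pole_pair(a1, a2, fs)
```

For larger p one would factor the full denominator polynomial and apply the 
same conversion to each complex pole pair. 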

The basic idea here is the following: as the model (B.1)-(B.3) indicates, 
the speech signal y(k) contains highly redundant information, and a 
straightforward transmission of the signal will require high channel capacity 
for accurate reconstruction of speech. On the other hand, rearranging terms in 
(B.3), 

$$ y(k) = -\sum_{i=1}^{p} a_i\, y(k-i) + u(k) \tag{B.4} $$

we see that (B.4) represents a predictor, in which 

$$ \hat{y}(k) = -\sum_{i=1}^{p} a_i\, y(k-i) \tag{B.5} $$

is the one-step predicted estimate of y. As discussed in [B-70], one often 
(and, in particular, in the speech problem) requires far fewer bits to code the 
prediction error u than the original signal y. Thus, one arrives at an efficient 
transmission scheme (linear predictive coding, LPC): given y, estimate the 
$a_i$, compute u, and transmit the $a_i$ and u. At the receiver, we then can use 
(B.4) to reconstruct y (of course, one must confront problems of quantization, 
and we refer the reader to the references (e.g., [B-119]) for discussions of 
this problem). An alternative interpretation of this procedure is the following: 
given y, estimate G in (B.2), pass y through the inverse, all-zero (moving 
average, MA) filter 1/G(z), and transmit the coefficients in G and the output of 
the inverse filter. At the receiver, we then pass the received signal through 
G to recover y (thus this procedure is causal and causally invertible). 
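
The analysis/synthesis round trip just described can be sketched in a few 
lines. This is only an illustration under simplifying assumptions: no 
quantization, known coefficients, and y(k) = 0 for k < 0. The function names 
and the sample values are hypothetical. 

```python
def lpc_residual(y, a):
    """Inverse (all-zero) filter 1/G(z): u(k) = y(k) + sum_i a_i y(k-i)."""
    u = []
    for k in range(len(y)):
        acc = y[k]
        for i in range(1, len(a) + 1):
            if k - i >= 0:
                acc += a[i - 1] * y[k - i]
        u.append(acc)
    return u


def lpc_synthesize(u, a):
    """All-pole filter G(z), equation (B.4): y(k) = -sum_i a_i y(k-i) + u(k)."""
    y = []
    for k in range(len(u)):
        acc = u[k]
        for i in range(1, len(a) + 1):
            if k - i >= 0:
                acc -= a[i - 1] * y[k - i]
        y.append(acc)
    return y


a = [-0.9, 0.2]                       # illustrative coefficients a_1, a_2
y = [1.0, 0.5, -0.25, 0.75, 0.1, -0.6]
u = lpc_residual(y, a)                # what the transmitter would code
y_rec = lpc_synthesize(u, a)          # what the receiver reconstructs
```

Without quantization the reconstruction is exact up to rounding, which is the 
sense in which the inverse filter is causally invertible. 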
The question remains as to how one estimates the $a_i$. The most widely 
used technique in the literature is linear prediction. Using the 
interpretation of $1 - 1/G(z)$ as a one-step predictor for the signal y, we wish 
to choose the coefficients $a_1,\dots,a_p$ to minimize the sum of squares of the 
prediction errors (see footnote 2) 

$$ J = \sum_{n} e^2(n), \qquad e(n) = y(n) - \hat{y}(n) \tag{B.6} $$


Here we assume that we are given y(0),...,y(N-1). Also, the range of n in 
the definition of J can be chosen in different manners, and we will see in 
the following subsections that different choices can lead to different results 
and to different interpretations. A number of these interpretations are 
given in [B-26,44], and we will discuss several of these as we investigate 
this problem somewhat more deeply. Specifically, in the next two subsections 
we consider two linear prediction methods, the autocorrelation and covariance 
methods, and we relate them to several statistical notions of importance in 
control and estimation applications. Following this, we will discuss several 
other identification methods and their relationship to the speech problem. 


2 We note that one can modify the linear prediction formulation in order to take 
into account the quasi-periodic nature of speech for voiced sounds. We refer the 
reader to [B-70], in which such a procedure is developed and in which one also 
obtains an estimate of the pitch period. An alternative approach to this problem 
is to solve the linear prediction problem as outlined in the next two 
subsections, pass the speech through the inverse filter, and analyze the 
resulting signal to determine the pitch [B-25,44]. Recently, Steiglitz and 
Dickinson [B-100] have described a method for improving pole estimation by 
completely avoiding that part of a voiced speech signal that is driven by 
glottal excitation. 

Before beginning these investigations, let us carry out the minimization 
required in linear prediction. Taking the first derivative of J with respect 
to the $a_i$ and setting these equal to zero, we obtain the normal equations 

$$ \sum_{i=1}^{p} a_i\, c_{ik} = -c_{0k}, \qquad k = 1,\dots,p \tag{B.7} $$

where 

$$ c_{ik} = \sum_{n} y(n-i)\, y(n-k) \tag{B.8} $$

These equations are typical of the types of equations that arise in linear, 
least-squares problems, and their efficient solution has been the topic of 
many research efforts. This issue is the central focus of the next two 
subsections. 

B.1 The Autocorrelation Method, Kalman Filtering for Stationary 
Processes, and Fast Algorithms 

Suppose we consider minimizing the sum-squared error in (B.6) over the 
infinite interval, $-\infty < n < \infty$. Here, we define y(n) = 0 for n < 0 
and for $n \ge N$. In this case, we find that 

$$ c_{ij} = \sum_{n=0}^{N-1-|i-j|} y(n)\, y(n+|i-j|) = r(|i-j|) \tag{B.9} $$

and the normal equations become 

$$ T a = c \tag{B.10} $$

where $a' = (a_1,\dots,a_p)$, $c' = (-r(1), -r(2), \dots, -r(p))$, and T is a 
symmetric Toeplitz matrix [B-37,84,91] (i.e. the ij-th element depends only on 
|i-j|): 

$$ T = \begin{bmatrix}
r(0)   & r(1)   & \cdots & r(p-1) \\
r(1)   & r(0)   & \cdots & r(p-2) \\
r(2)   & r(1)   & \cdots & r(p-3) \\
\vdots & \vdots &        & \vdots \\
r(p-1) & r(p-2) & \cdots & r(0)
\end{bmatrix} \tag{B.11} $$


Before we consider the solution of (B.10), let us derive equations of the 
very same form from a probabilistic point of view (here we follow [B-26]). 
Suppose that y is a stationary random process, and, instead of (B.6), we are 
interested in minimizing 

$$ J = E(e^2(n)) \tag{B.12} $$

where e and $\hat{y}$ are defined as before (although they now are random 
processes themselves). Differentiating (B.12) as before, we obtain the normal 
equations 

$$ T a = c \tag{B.13} $$

where $c' = (-R(1), -R(2), \dots, -R(p))$, T is the symmetric Toeplitz matrix 
whose ij-th element is R(|i-j|), and R(i) is the autocorrelation 

$$ R(i) = E(y(n)\, y(n+i)) \tag{B.14} $$

Examining (B.9)-(B.14), we see that the two formulations are strikingly similar, 
and one can view (B.9) as a method for estimating the autocorrelation of 
an ergodic, stationary process [B-26] (if we normalize (B.9) appropriately). 

This statistical point of view is extremely useful in order to obtain certain 
insights into the approach and also in order to allow us to connect this method 
with certain recent results in linear estimation theory (see [B-44] for several 
other interpretations of this method). 
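
To make the autocorrelation formulation concrete, the following sketch 
computes the unnormalized lags r(i) of (B.9) from a finite record and 
assembles the Toeplitz system (B.10)-(B.11). The helper names and the data 
values are assumptions introduced only for illustration. 

```python
def autocorr(y, max_lag):
    """r(i) = sum_{n=0}^{N-1-i} y(n) y(n+i), i = 0..max_lag (unnormalized, as in (B.9))."""
    N = len(y)
    return [sum(y[n] * y[n + i] for n in range(N - i)) for i in range(max_lag + 1)]


def toeplitz_system(r, p):
    """Return (T, c) of (B.10): T[i][j] = r(|i-j|) and c[k] = -r(k+1)."""
    T = [[r[abs(i - j)] for j in range(p)] for i in range(p)]
    c = [-r[k] for k in range(1, p + 1)]
    return T, c


y = [1.0, 2.0, 3.0, 4.0]      # toy data record, N = 4
r = autocorr(y, 2)            # lags r(0), r(1), r(2)
T, c = toeplitz_system(r, 2)  # second-order predictor, p = 2
```

Dividing each r(i) by N would give the usual biased sample autocorrelation, 
which is one way of carrying out the normalization mentioned above. 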

The solution of equations such as (B.10) and (B.13) has been the subject 
of a great deal of attention in the mathematical, statistical, and engineering 
literature [B-4,7,26,34,35,36,37,50,72,84,91,94,95,96]. An efficient 
algorithm was proposed by Levinson [B-34], improved upon by Durbin [B-94], and 
studied in the speech processing context by several authors, including Itakura 
and Saito [B-50] (a version of this algorithm is given later in this subsection). 
As discussed in [B-26,44], the method essentially consists of solving forward 
and backward prediction problems of increasing size in a recursive manner and 
is known to be extremely efficient. That is, the algorithm computes the 
coefficients a(1|i),...,a(i|i) for the best prediction of y(n) based on 
y(n-1),...,y(n-i) and the coefficients b(1|i),...,b(i|i) for the best prediction 
of y(n-i-1) based on y(n-i),...,y(n-1). The algorithm iterates on i. As a 
part of this algorithm, one computes the prediction error (for both forward and 
backward prediction), and thus one can determine when to stop based on the size 
of this quantity. Also, we must compute a coefficient $k_i$, which is known as the 
partial correlation coefficient between the forward and backward prediction 
errors (see [B-26,44,50]). We will mention this quantity again at the end of 
this subsection. 
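
A minimal sketch of the scalar order recursion is given below, written in the 
sign convention of (B.5) and (B.10), so that the predictor is 
y_hat(n) = -sum_i a_i y(n-i). It follows the usual textbook form of the 
Levinson-Durbin recursion and is not a transcription of any one reference; the 
function name and the test lags are assumptions for illustration. 

```python
def levinson_durbin(r, p):
    """Solve sum_i a_i r(|i-k|) = -r(k), k = 1..p, by recursion on the order.

    Returns (a, ks, err): a[j] holds a_{j+1} of the order-p predictor, ks the
    partial correlation coefficients k_1..k_p, err the final prediction error."""
    a, ks, err = [], [], r[0]
    for i in range(1, p + 1):
        # Correlation of the order-(i-1) forward error with the next lag.
        acc = r[i] + sum(a[j] * r[i - 1 - j] for j in range(i - 1))
        k = -acc / err
        # The new coefficients combine the old ones with their reversal.
        a = [a[j] + k * a[i - 2 - j] for j in range(i - 1)] + [k]
        ks.append(k)
        err *= (1.0 - k * k)
    return a, ks, err


r = [30.0, 20.0, 11.0]                 # illustrative lags r(0), r(1), r(2)
a, ks, err = levinson_durbin(r, 2)
```

Each pass costs on the order of i operations, so the whole solution takes on 
the order of p^2 operations rather than the p^3 of general elimination, and 
the running value of err gives the stopping criterion mentioned above. 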

Let us now examine what this algorithm means from a statistical point of 
view. The first stage of the algorithm produces a(1|1) and b(1|1), which 
are the coefficients of the best one-step predictors 

$$ \begin{aligned}
\hat{y}(1) &= -a(1|1)\, y(0) \\
\hat{y}(0) &= -b(1|1)\, y(1)
\end{aligned} \tag{B.15} $$

At the next stage, we have a(1|2), a(2|2), b(1|2), b(2|2): 

$$ \begin{aligned}
\hat{y}(2) &= -a(1|2)\, y(1) - a(2|2)\, y(0) \\
\hat{y}(0) &= -b(1|2)\, y(1) - b(2|2)\, y(2)
\end{aligned} \tag{B.16} $$

Continuing, we find that after i steps we have the predictors 

$$ \hat{y}(i) = -\sum_{j=1}^{i} a(j|i)\, y(i-j) \tag{B.17} $$

$$ \hat{y}(0) = -\sum_{j=1}^{i} b(j|i)\, y(j) \tag{B.18} $$


Thus, we can think of the linear prediction solution as providing us with 
the time-varying coefficients of the weighting pattern of the optimal one-step 
predictor (B.17) or of the optimal initial time smoother (B.18). Note that 
these coefficients are, in general, time varying in the following sense: 
from (B.17), we see that a(j|i) is the coefficient that multiplies the data 
point that occurs j units of time before the one whose value we wish to predict. 
If the filter were time-invariant, this would not depend on i. The reason 
for the time-varying nature of the predictor coefficients is that, although 
the y's are a stationary process, the mechanism of prediction is time-varying 
when one bases the prediction on only a finite set of data (recall that the 
time-invariant Wiener filter assumes an infinite record of observations). 

What does this mean as far as all-pole modeling via linear prediction 
goes? The answer to that is not much. In the all-pole modeling problem, we 
are equivalently only interested in designing an FIR filter, i.e. a prediction 
filter that produces the best estimate of y(n) given the "data window" 
y(n-1),...,y(n-p). The coefficients of such a filter are precisely 
a(1|p),...,a(p|p), and it doesn't matter (except from a computational point 
of view) that these coefficients were generated as part of a time-varying 
filter weighting pattern. 

On the other hand, the time-varying weighting pattern interpretation is 
extremely important from a statistical point of view, especially if one 
wishes to design recursive predictors that are capable of incorporating all 
past measurements and not just a data window. Clearly one inefficient way to 
do this is to implement a nonrecursive filter that stores all past data 
y(0),...,y(n-1), multiplies by the appropriate a(i|n), and combines to form 
$\hat{y}(n)$. This requires growing memory and is hardly appealing. How can one 
avoid such difficulties? An answer that is popular in state-space control and 
estimation theory arises if y has a Markovian representation (see footnote 3) 

$$ \begin{aligned}
x(k+1) &= A x(k) + w(k) \\
y(k) &= c' x(k)
\end{aligned} \tag{B.19} $$

where x is a random n-vector (x(0) is assumed to be zero mean), A is a 
constant n x n matrix, c is a constant n-vector, and w is a zero-mean 
uncorrelated sequence (uncorrelated with x(0)) with 

$$ E(w(k) w'(k)) = Q \tag{B.20} $$

3 We will briefly discuss the problem of finding such a representation later 
in this section. 

The correlation coefficients of y can be computed from the equations 

$$ E(y(k) y(j)) = c' E(x(k) x'(j))\, c \tag{B.21} $$

$$ E(x(k) x'(j)) = \begin{cases}
A^{k-j} P(j), & k \ge j \\
\left[ E(x(j) x'(k)) \right]', & k < j
\end{cases} \tag{B.22} $$

where P is the covariance of x, which satisfies 

$$ P(j+1) = A P(j) A' + Q \tag{B.23} $$

Note that in general E(y(k)y(j)) will not depend on (k-j) alone. This will 
occur if and only if A is a stable matrix and P = P(0) satisfies the Lyapunov 
equation 

$$ A P A' - P = -Q \tag{B.24} $$

(in which case both x and y are stationary). Suppose now that (B.24) holds 
and that 

$$ R(|i-j|) = c' A^{|i-j|} P c \tag{B.25} $$
where the R(i) are the quantities defined in (B.13)-(B.14). We now wish to 
design an optimal predictor for estimating (recursively) y(n) given 
y(0),...,y(n-1). This is a standard state-space estimation problem [A-65], 
and the solution is the Kalman filter (which actually produces a prediction 
for the vector x(n)): 

$$ \begin{aligned}
\hat{y}(n) &= c' \hat{x}(n) \\
\hat{x}(n) &= A \hat{x}(n-1) + A K(n-1)\, \gamma(n-1) \\
\gamma(n-1) &= y(n-1) - \hat{y}(n-1) \\
\hat{x}(0) &= 0
\end{aligned} \tag{B.26} $$

where the time-varying gain satisfies (see footnote 4) 

$$ K(n) = \frac{P(n|n-1)\, c}{c' P(n|n-1)\, c} \tag{B.27} $$

Here P(n|n-1) is the covariance of the prediction error $x(n) - \hat{x}(n)$: 

$$ P(n+1|n) = A P(n|n-1) A' + Q
 - \frac{A P(n|n-1)\, c\, c' P(n|n-1) A'}{c' P(n|n-1)\, c} \tag{B.28} $$

4 Note that we require $c'P(n|n-1)c \neq 0$. As discussed in [B-67], this 
requires the positivity of the covariance R(i), which is clearly related to the 
statement that y(n) is not a deterministic function of y(0),...,y(n-1) for 
any n. 

Let us make a few comments about these equations. Note that the filter 
innovation $\gamma(n)$ is precisely the prediction error, and its covariance is 
c'P(n|n-1)c, which is nothing more than (B.12). Also, recall that in the 
all-pole framework, we could alternatively view the prediction filter as 
specifying an inverse filter, which took the y's as inputs and produced the 
uncorrelated sequence of prediction errors as the output. In the context 
of the Kalman filter, the analogous filter is the innovations representation 
(see representation IR-1 of [B-67]), in which we view the output of (B.26) 
as being $\gamma(n)$. Finally, note that one can compute the predictor 
coefficients a(j|i) as the weighting pattern of the filter: 

$$ \begin{aligned}
a(1|1) &= -c' A K(0) \\
a(1|2) &= -c' A K(1) \\
a(2|2) &= -c' A^2 K(0) + c' A K(1)\, c' A K(0)
\end{aligned} \tag{B.29} $$
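
The correspondence in (B.29) can be checked numerically on a small example. 
The sketch below uses a hypothetical two-state model (the matrices A, Q and 
the vector c are chosen only for illustration), obtains the stationary 
covariance by iterating (B.23), computes R(0), R(1), R(2) from (B.25), runs 
the gain recursion (B.27)-(B.28) for two steps, and compares the resulting 
weighting-pattern coefficients with a direct solution of the second-order 
normal equations. All helper names are assumptions of this sketch. 

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def mat_vec(X, v):
    return [sum(x * vi for x, vi in zip(row, v)) for row in X]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def stationary_cov(A, Q, iters=400):
    """Iterate P <- A P A' + Q (B.23); for stable A this converges to (B.24)."""
    P, At = [row[:] for row in Q], transpose(A)
    for _ in range(iters):
        APA = mat_mul(mat_mul(A, P), At)
        P = [[APA[i][j] + Q[i][j] for j in range(len(Q))] for i in range(len(Q))]
    return P


def kalman_gains(A, Q, c, P0, steps):
    """Return K(0), ..., K(steps-1) from (B.27), propagating P by (B.28)."""
    n, P, gains, At = len(c), [row[:] for row in P0], [], transpose(A)
    for _ in range(steps):
        Pc = mat_vec(P, c)
        s = dot(c, Pc)                        # innovation variance c'P(n|n-1)c
        gains.append([v / s for v in Pc])     # (B.27)
        APc = mat_vec(A, Pc)
        APA = mat_mul(mat_mul(A, P), At)
        P = [[APA[i][j] + Q[i][j] - APc[i] * APc[j] / s for j in range(n)]
             for i in range(n)]               # (B.28)
    return gains


# Hypothetical stable two-state model.
A = [[0.6, 0.1], [0.0, -0.4]]
Q = [[1.0, 0.0], [0.0, 1.0]]
c = [1.0, 0.5]

P = stationary_cov(A, Q)
v = mat_vec(P, c)
R = [dot(c, v)]                               # R(0), then R(1), R(2) via (B.25)
for _ in range(2):
    v = mat_vec(A, v)
    R.append(dot(c, v))

K0, K1 = kalman_gains(A, Q, c, P, 2)
cAK0, cAK1 = dot(c, mat_vec(A, K0)), dot(c, mat_vec(A, K1))
a11 = -cAK0                                   # (B.29)
a12 = -cAK1
a22 = -dot(c, mat_vec(A, mat_vec(A, K0))) + cAK1 * cAK0

# Direct solution of the 2 x 2 normal equations (B.13) for comparison.
det = R[0] ** 2 - R[1] ** 2
a1_direct = (-R[1] * R[0] + R[1] * R[2]) / det
a2_direct = (R[1] ** 2 - R[0] * R[2]) / det
```

Under these assumptions a11 agrees with -R(1)/R(0), and (a12, a22) agree with 
the direct solution, which is the sense in which the Kalman predictor and the 
linear prediction recursion generate the same weighting pattern. 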


The Kalman filter and innovations representations have been the subjects of 
a great deal of research in the last 15 years, and the technique described 
above has been studied in discrete and continuous time (see footnote 5), for 
multiple output systems, for time-varying systems, and for systems in which the 
actual observations are noisy versions of the y's 

$$ z(n) = y(n) + v(n) \tag{B.30} $$

We refer the reader to the many references on this subject, including [A-65], 
[B-7,58,67]. 

5 We note that in continuous time one has a somewhat more difficult time, i.e. 
we don't consider "one-step" prediction and in fact run into difficulties if 
we assume we observe y as opposed to a noise-corrupted version. We refer the 
reader to [B-67] and to the references therein for more on this problem. 

Examining (B.26)-(B.28), we see that the computation of the recursive 
filter coefficients requires the solution of the (discrete time) Riccati 
equation (B.28). If x is an n-vector, then (using the fact that P is symmetric) 
(B.28) represents n(n+1)/2 equations. For reasonably large values of n, this 
can be an extreme computational load, especially given that all that is needed 
for the filter is the gain matrix K, which in the scalar output case is an 
n-vector. Also, if there are m outputs, K is n x m, and, as is often the case, 
the number of parameters in K is much smaller than the number in P (i.e. m is 
substantially smaller than n). Thus, the question of computing K without P 
arises quite naturally, and this issue, in both continuous and discrete time, 
in stationary and in some nonstationary cases, has been the subject of 
numerous papers in the recent past [B-1-8,23,39,40,56,60,64-66,72,73,77]. 

It is not our intention here to discuss these techniques in detail. What 
we do want to do is to point out that the underlying concepts that have led 
to these "fast algorithms" (at least in the stationary case) are the same as 
those that lead to the Levinson algorithm. For some historical and mathematical 
perspective on this subject, we refer the reader to [B-4,7,63,66]. In 
particular, the extension of the Levinson algorithm to the multivariable case 
is discussed in these papers (see also references [B-35,36]). In this case, 
the matrix T in (B.10) or (B.13) is block-Toeplitz, and the extension to this 
case is decidedly nontrivial (for other methods for handling equations involving 
block-Toeplitz matrices, we refer the reader to [B-37,56,84,91,95,96]). Also, 
in [B-4], the derivation of the Levinson-type algorithms and Kalman gain 
equations in discrete and continuous time are shown (in the stationary case) to 
rely on the simultaneous solution of forward and backward filtering problems 
(thus introducing a "backward innovation process," representing backward 
prediction errors). It is also shown that both continuous and discrete algorithms 
are obtainable from the Bellman-Krein formulas [B-4,7,42,43,64,65,66], which 
describe the evolution of the weighting pattern of the optimal estimator of a 
stationary process. From this, one can obtain the Levinson algorithm (and its 
continuous analog) and some well-known relationships with orthogonal polynomials 
[B-4,41]. If one knows that the process y has a Markovian representation, one 
can then take the Levinson-type equations together with the state space 
representation and obtain fast algorithms for the Kalman gain. An excellent 
treatment of this is given in [B-4], and it is recommended that the reader 
compare the discrete-time results there to those in [B-26,44] in order to see 
the relationship between the linear prediction equations and the version of the 
Levinson algorithm derived in [B-4]. For a thorough historical perspective, we 
recommend the survey paper [B-7]. 

In this paper we will limit ourselves to a brief outline of one of the 
derivations in [B-4]. Let y(n) be a vector stationary, zero mean process with 
covariance 

$$ R(t-s) = E(y(t)\, y(s)') \tag{B.31} $$

We observe the process (see footnote 6) 

$$ z(n) = y(n) + w(n) \tag{B.32} $$

where w is a zero mean, uncorrelated process, uncorrelated with y, with 
covariance 

$$ E(w(n)\, w(n)') = I \tag{B.33} $$

6 As before, one can take w = 0 if R is positive definite. Lindquist discusses 
this in [B-4]. 

Let $\hat{y}(t|r)$ denote the wide sense conditional mean of y(t) given 
z(0),...,z(r); then [B-4] 

$$ \hat{y}(t|r) = \sum_{s=0}^{r} G_r(t,s)\, z(s) \tag{B.34} $$

where the weighting pattern is defined by 

$$ G_r(t,s) = E[\tilde{y}(t|r)\, \tilde{y}(s|r)'] = G_r(s,t)' \tag{B.35} $$

(here $\tilde{y}(i|j)$ is the estimation error $y(i) - \hat{y}(i|j)$). Also, 
the $G_r$ satisfy the (Toeplitz) equations 

$$ \begin{aligned}
G_r(t,s) + \sum_{i=0}^{r} G_r(t,i)\, R(i-s) &= R(t-s) \\
G_r(t,s) + \sum_{i=0}^{r} R(t-i)\, G_r(i,s) &= R(t-s)
\end{aligned} \tag{B.36} $$


Note that $\hat{y}(t|t-1)$ is the one-step prediction estimate, and from (B.34) 
we can identify (in the scalar case) 

$$ G_{t-1}(t,s) = -a(t-s|t) \tag{B.37} $$

Comparing (B.36), (B.37) with the normal equations (B.10), we see that we have 
similar equations (the first term on the left-hand side of (B.36) comes from the 
presence of w, but these equations can also be obtained when w = 0 if we can 
write $R = \bar{R} + \epsilon I$ for some positive semidefinite $\bar{R}$; see 
[B-9]). Also, as pointed out in [B-4], the Toeplitz equations are the 
counterparts of certain Fredholm resolvent equations that arise in the 
continuous case [B-64,65]. 

Lindquist's derivation of the fast algorithms for computing $G_{t-1}(t,s)$ 
(one-step prediction) and $G_t(t,s)$ (filtering estimate) begins with the 
Bellman-Krein formulas (see footnote 7) 

$$ \begin{aligned}
G_{r+1}(t,s) &= G_r(t,s) - G_{r+1}(t,r+1)\, G_r(r+1,s) \\
G_{r+1}(t,s) &= G_r(t,s) - G_r(t,r+1)\, G_{r+1}(r+1,s)
\end{aligned} $$

We next define the "backwards weighting pattern" 

$$ G_r^*(t,s) = G_r(r-t,\, r-s) \tag{B.38} $$


and the matrix polynomials 

$$ \Phi_t(z) = I - \sum_{s=1}^{t} z^s\, G_{t-1}^*(s-1,-1) \tag{B.39} $$

$$ \Phi_t^*(z) = I - \sum_{s=1}^{t} z^s\, G_{t-1}(s-1,-1) \tag{B.40} $$


7 We note that the existence of two such formulas is related to the existence of 
both a one-step prediction and a filtering estimate, which is in clear distinction 
to the continuous-time case, in which we only have one such formula and filter. 
Indeed, the discrete time problem leads to a number of different types of 
innovations representations (see discussion in [B-67]) and also leads to more 
complex equations to be solved for the weighting pattern and gain. We refer the 
reader to [B-4,67,101,102] for more on the differences between the continuous 
and discrete time cases. 

As pointed out in [B-4], in the scalar case these polynomials are related to 
the Szegő polynomials. Also, if we let $\Phi_{t,i}$ denote the coefficient of 
$z^i$ in $\Phi_t$ (similarly for $\Phi_t^*$), and if we use (B.34), (B.35), 
(B.38)-(B.40), we obtain the prediction and smoothing equations 

$$ \hat{y}(t|t-1) = -\sum_{i=0}^{t-1} \Phi_{t,t-i}'\, z(i) \tag{B.41} $$

$$ \hat{y}(-1|t-1) = -\sum_{i=0}^{t-1} \Phi_{t,i+1}^{*\prime}\, z(i) \tag{B.42} $$


Thus, if we can recursively compute $\Phi_t$, $\Phi_t^*$, we can recursively 
solve for the weighting pattern of the desired predictor. Utilizing the 
Bellman-Krein equations, Lindquist derives these recursions, which yield the 
multivariable Levinson equations: 

$$ \Phi_0(z) = I \tag{B.43} $$

$$ \Phi_0^*(z) = I \tag{B.44} $$

[Equations (B.45), (B.46), and (B.48) are illegible in the source; see [B-4].] 

$$ [\,\text{left-hand side illegible}\,] = R(t+1) - \sum_{i=0}^{t-1} R(t-i)\, G_{t-1}(i,-1) \tag{B.47} $$


Here, $R_t$ plays the role of forward prediction error, $R_t^*$ is the backwards 
error, and $\Gamma_t$, $\Gamma_t^*$ are the multidimensional analogs of the 
partial correlation coefficient introduced earlier. These relationships can be 
seen much more easily if one looks at the scalar case and uses the following 
special relationships that hold in this case (note that these include the fact 
that in the scalar case the forward and backward predictors are essentially the 
same, a statement that is not true in the vector case): 

[Equation (B.49) is missing from the source; see [B-4].] 

Then, the algorithm becomes 

[Equations (B.50)-(B.52) are largely illegible in the source. The one 
recoverable relation is 

$$ \Gamma_t = R(t+1) - \sum_{i=0}^{t-1} R(t-i)\, G_{t-1}(i,-1) $$

see [B-4] for the remaining equations.] 


and the comparisons with the usual Levinson equations (equations (38a)-(38e) 
in [B-26]) are clear. 

Following this development, Lindquist next considers the case in which 
the y's have a Markovian representation. Using the algorithm (B.43)-(B.48), 
he is able to obtain a fast algorithm for the Kalman gain. For the details 
of this derivation, we refer the reader to [B-4]. 

Finally, we note that there are numerous physical and mathematical 
relationships between fast algorithms that have been derived in a number of 
disciplines. As discussed in [B-26,44], the auxiliary variable $k_i$ in the 
scalar Levinson algorithm (see footnote 8) has an interpretation as a 
reflection coefficient, and this fact has been utilized in speech processing, 
in which these coefficients specify certain parameters in an acoustic model of 
the speech process [B-26,44]. In addition, Casti and Tse [B-40], Kailath 
[B-6,7], and Sidhu and Casti [B-11] have shown that the fast Kalman gain 
algorithms are closely related to the work of certain astrophysicists, in 
particular Chandrasekhar [B-38], who devised algorithms for solving finite time 
Wiener-Hopf equations arising in radiative transfer. Also, relationships 
between linear filtering and scattering theory have been brought to light in 
the recent papers [B-77,101,102]. And finally, for a good overview of some of 
the mathematical relationships, we refer the reader to Genin and Kamp [D-145]. 
These ideas are of interest in that seeing these algorithms from several 
perspectives allows us to gain insight into their properties, potentials, and 
limitations. 

8 We note that in the multivariable case, the $k_i$ have two matrix counterparts 
($\Gamma_t$ and $\Gamma_t^*$ in [B-4]) which in general coincide only in the 
scalar case. This is due to the fact that the covariance matrix R is only block 
Toeplitz. This also leads to the differences between the forward and backward 
predictors, which in turn leads to an increase in computational complexity in 
the vector case. 

B.2 The Covariance Method, Recursive Least Squares Identification, 
and Kalman Filters 

Consider again the normal equations (B.7), (B.8). We now consider the 
range of n to be only as large as the actual data allows, i.e., in equation 
(B.3) we will require that k, k-1,...,k-p all be within the range 0,...,N-1. 
This leads to the following range for n: 

$$ p \le n \le N-1 \tag{B.53} $$

Note that in this case the normal equations become 

$$ S a = -d \tag{B.54} $$

where $d' = (c_{01}, c_{02}, \dots, c_{0p})$, and S is a symmetric matrix whose 
ij-th element is $c_{ij}$. Note that $c_{ij}$ is not in general a function of 
i-j, and thus S is not Toeplitz. 

We note that this method also has several interpretations. As discussed 
by Makhoul [B-26], one can obtain equations of identical form as the linear 
least squares predictor for a nonstationary process. In addition, as discussed 
in [B-44], if one makes a Gaussian assumption, then the covariance method 
produces the conditional maximum likelihood estimate of a, given y(0),...,y(p-1). 
We refer the reader to [B-44] for several other interpretations of the 
covariance method. 

Turning to the solution of (B.54), we find that the fast methods described 
in the preceding section do not carry over quite so nicely, since S is not 
Toeplitz. In [B-44], however, a method analogous to the Levinson routine, 
in that it iterates on the order of the predictor filter and computes forward 
and backward predictors simultaneously, is described. This method is not 
nearly as efficient as in the autocorrelation case, and this can be traced to 
the fact that (B.49) does not hold in this case (even for the one-dimensional 
problem). As discussed in [B-44], the solution to the autocorrelation and 
covariance equations can be viewed as performing a Cholesky decomposition, 
or equivalently a Gram-Schmidt orthogonalization, of T and S. In the Toeplitz 
case, very fast algorithms exist for Cholesky decomposition (see the previous 
section and [B-37]), while this procedure is somewhat slower for symmetric, 
non-Toeplitz matrices. Recently, however, Morf, et al. [B-71] have obtained 
fast algorithms for the covariance method by exploiting the fact that, although 
S is not Toeplitz, it is the product of Toeplitz matrices (see equations 
(B.56)-(B.59)). We refer the reader to [B-71] for the details of several 
algorithms that essentially involve embedding the original scalar prediction 
problem into a multidimensional one to which the fast vector Levinson algorithm 
can be applied. 
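
For small p the covariance-method equations (B.54) can simply be formed and 
solved by elimination. The sketch below does this for p = 2 on a noise-free 
test signal whose true coefficients are known; it is only an illustration, 
not one of the fast algorithms cited above, and the function names and the 
test signal are assumptions of the example. 

```python
def covariance_normal_equations(y, p):
    """S[i-1][j-1] = c_ij and d[k-1] = c_0k, with c_ij summed over p <= n <= N-1
    as in (B.8) and (B.53)."""
    N = len(y)

    def c(i, j):
        return sum(y[n - i] * y[n - j] for n in range(p, N))

    S = [[c(i, j) for j in range(1, p + 1)] for i in range(1, p + 1)]
    d = [c(0, k) for k in range(1, p + 1)]
    return S, d


def solve(M, b):
    """Gaussian elimination with partial pivoting for a small system M x = b."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][k] * x[k] for k in range(i + 1, n))) / aug[i][i]
    return x


# Impulse response of y(k) - 0.9 y(k-1) + 0.2 y(k-2) = u(k), i.e. a = (-0.9, 0.2).
y = [1.0, 0.9]
for k in range(2, 12):
    y.append(0.9 * y[k - 1] - 0.2 * y[k - 2])

S, d = covariance_normal_equations(y, 2)
a_hat = solve(S, [-v for v in d])      # S a = -d, equation (B.54)
```

Because the excitation vanishes over the range (B.53), the estimate recovers 
the true coefficients up to rounding, and the unequal diagonal entries of S 
show directly that S is not Toeplitz. 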
Let us take a look at the covariance method from a slightly different 
point of view. Recall that the algorithm mentioned above and the one in the 
preceding subsection involve recursions on the order of the filter given a 
fixed set of data. Suppose now we consider a recursion for updating coefficients 
of a fixed order filter given more and more data. To do this, we refer to the 
survey paper [B-16] (see footnote 9), where the covariance method, termed the 
"least squares method," is discussed. Given the data y(0),...,y(N-1), the 
covariance method attempts to find a least squares fit to the equation 

$$ L_{N-1}\, a = f_{N-1} \tag{B.55} $$

where 

$$ L_{N-1} = \begin{bmatrix}
-y(p-1) & -y(p-2) & \cdots & -y(0) \\
-y(p)   & -y(p-1) & \cdots & -y(1) \\
-y(p+1) & -y(p)   & \cdots & -y(2) \\
\vdots  & \vdots  &        & \vdots \\
-y(N-2) & -y(N-3) & \cdots & -y(N-p-1)
\end{bmatrix} \tag{B.56} $$

$$ a = \begin{bmatrix} a_1 \\ \vdots \\ a_p \end{bmatrix}, \qquad
f_{N-1} = \begin{bmatrix} y(p) \\ y(p+1) \\ \vdots \\ y(N-1) \end{bmatrix} \tag{B.57} $$


9 In this survey paper the autocorrelation method, called the "correlation 
method," is also discussed and is compared to least squares. 

The least-squares solution is given by 

$$ L_{N-1}' L_{N-1}\, \hat{a}(N-1) = L_{N-1}' f_{N-1} \tag{B.58} $$

which can be seen to be identical to (B.54). Thus, the covariance method 
computes 

$$ \hat{a}(N-1) = \left( L_{N-1}' L_{N-1} \right)^{-1} L_{N-1}' f_{N-1} \tag{B.59} $$

Suppose we have $\hat{a}(N-1)$ and we now obtain the new data point y(N). We 
would like to update our estimate to $\hat{a}(N)$ in a manner more efficient than 
re-solving (B.58) from scratch. Following standard recursive least squares (RLS) 
procedures [B-16], we note that incorporation of y(N) into (B.55) adds a new 
equation, i.e. it adds a last row 

$$ \ell'(N) = (-y(N-1), -y(N-2), \dots, -y(N-p)) \tag{B.57} $$

to $L_{N-1}$ and a last element, y(N), to $f_{N-1}$. Thus, (B.55) takes the form 

$$ \begin{bmatrix} L_{N-1} \\ \ell'(N) \end{bmatrix} a
 = \begin{bmatrix} f_{N-1} \\ y(N) \end{bmatrix} \tag{B.58} $$

and (B.59) becomes 

$$ \hat{a}(N) = \left[ L_{N-1}' L_{N-1} + \ell(N)\, \ell'(N) \right]^{-1}
 \left[ L_{N-1}' f_{N-1} + \ell(N)\, y(N) \right] \tag{B.59} $$

With the aid of the matrix inversion lemma [A-65], we can rewrite (B.59) as 

$$ \hat{a}(N) = \hat{a}(N-1) + K(N) \left[ y(N) - \ell'(N)\, \hat{a}(N-1) \right] \tag{B.60} $$

where 

$$ K(N) = \frac{P(N-1)\, \ell(N)}{1 + \ell'(N)\, P(N-1)\, \ell(N)} \tag{B.61} $$

and 

$$ P(N) = P(N-1) - \frac{P(N-1)\, \ell(N)\, \ell'(N)\, P(N-1)}{1 + \ell'(N)\, P(N-1)\, \ell(N)} \tag{B.62} $$

Here P(N) denotes $(L_N' L_N)^{-1}$. 
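
The recursion (B.60)-(B.62) is short enough to sketch directly. In the 
example below the start-up choice (zero initial estimate and P equal to a 
large multiple of the identity) is a common practical device and an assumption 
of this sketch rather than something taken from the references; with it the 
recursion reproduces a very slightly regularized least squares solution. The 
function name and the test signal are likewise hypothetical. 

```python
def rls_update(a_hat, P, ell, y_new):
    """One step of (B.60)-(B.62); ell is the new row l(N) and P is symmetric."""
    p = len(a_hat)
    Pl = [sum(P[i][j] * ell[j] for j in range(p)) for i in range(p)]
    denom = 1.0 + sum(ell[i] * Pl[i] for i in range(p))
    K = [v / denom for v in Pl]                                    # (B.61)
    resid = y_new - sum(ell[i] * a_hat[i] for i in range(p))       # innovation
    a_new = [a_hat[i] + K[i] * resid for i in range(p)]            # (B.60)
    P_new = [[P[i][j] - Pl[i] * Pl[j] / denom for j in range(p)]   # (B.62)
             for i in range(p)]
    return a_new, P_new, resid


# Noise-free test signal: impulse response of y(k) = 0.9 y(k-1) - 0.2 y(k-2) + u(k),
# so the true coefficients in (B.3) are a = (-0.9, 0.2).
y = [1.0, 0.9]
for k in range(2, 12):
    y.append(0.9 * y[k - 1] - 0.2 * y[k - 2])

p = 2
a_hat = [0.0] * p
P = [[1e6 if i == j else 0.0 for j in range(p)] for i in range(p)]  # assumed start-up
for N in range(p, len(y)):
    ell = [-y[N - i] for i in range(1, p + 1)]                      # row l'(N)
    a_hat, P, resid = rls_update(a_hat, P, ell, y[N])
```

Each update costs on the order of p^2 multiplications, and no matrix is ever 
inverted explicitly. 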


Examining these equations, we see that they represent a Kalman filter 
(see [B-17]). In fact, referring to [B-24,47], we see that these are 
precisely the Kalman filter equations used by Melsa, et al. in speech processing. 
Specifically, they consider the dynamic equations 

$$ a(k+1) = a(k) \tag{B.63} $$

$$ y(k) = z'(k)\, a(k) + v(k) \tag{B.64} $$

where 

$$ z'(k) = -(y(k-1), y(k-2), \dots, y(k-p)) \tag{B.65} $$

and v(k) is a zero-mean, white process with 

$$ E(v^2(k)) = \Psi \tag{B.66} $$

If $\Psi$ is set to 1, we obtain the solution to the covariance equations. Also, 
in this formulation, P(N) has the interpretation as the covariance of the 
estimation error $a - \hat{a}(N)$. 

Let us note some of the propertxes of the recursxve solutxon (B.60)- 
(B.62) . Examxnxng (B.60) , we see that the xncrement xn our estxmate a xs 
proportxonal to the error (xnnovatxons) xn predxctxng the latest value of y 
usxng precedxng values cind our prevxous estimate of a. Thxs suggests that a 
monitoring of the residuals 

r(N) = y(N) - A' (N)a{N-l) (B.67) 

can be used to help detect abrupt changes xn the predictor coefficients^*^ or 
the presence of glottal excitation xn voiced sounds. In this manner one may 
be able to improve upon the estimation of a. Whether such a procedure would 
be of value is a matter for future study. We only note here that such techniques 
have been developed and have been successfully applied to a variety of problems 
including the detection of arrhythmias in electrocardiograms [B~103,104], Also, 
it is possible to make the filter more responsive to changes in the coefficients 
by usxng one of several methods available for adjusting Kalman filters [A-65] . 
These include exponentially age-weighting old data xn favor of the more recent 
pieces of information or the modeling of a as a slowly-varying Markov process 

aCk+1) = AaCk) + w(k) {B.68) 

where A xs a stable matrix, and w is zero mean white noise with covarxance Q. 

In this case, equations (B.60)-(B.62) become

    \hat{a}(N) = A\,\hat{a}(N-1) + K(N)\,[\,y(N) - \ell'(N)\,A\,\hat{a}(N-1)\,]    (B.69)

    K(N) = \frac{P(N|N-1)\,\ell(N)}{1 + \ell'(N)\,P(N|N-1)\,\ell(N)}    (B.70)

    P(N|N-1) = A\,P(N-1|N-1)\,A' + Q    (B.71)

    P(N|N) = P(N|N-1) - \frac{P(N|N-1)\,\ell(N)\,\ell'(N)\,P(N|N-1)}{1 + \ell'(N)\,P(N|N-1)\,\ell(N)}    (B.72)

* We note that Bergland [B-124] has suggested monitoring the residuals of a linear
predictor in order to determine when to update the estimates of the predictor
coefficients.


Again, the utility of such a procedure is not clear, and further thought and
experimentation are necessary.

Let us now consider the computational complexity of (B.60)-(B.62).
First note that one does not have to compute the correlation coefficients
(elements of s in (B.54)). However, one does have to calculate K(N) at every
stage, and if one solves for the gain from the Riccati equation (B.62), one
has on the order of p^2 multiplications per stage. However, Morf et al. [B-71]
and Morf and Ljung [B-120] have exploited the structure of the equations to
obtain fast algorithms for the direct computation of K. Combined with the fast
algorithm mentioned earlier, one now has efficient recursive procedures for
the covariance method as one increases either the order p of the predictor or
the number N of data points (or both simultaneously). The most efficient
procedure is to use p=1 and process the data points successively. At the end
of this procedure, one can then increase p until an acceptable prediction error
is obtained. We refer the reader to [B-71,120] for details.
We also note that Gibson et al. [B-47] have proposed a filter of the
same structure as (B.60)-(B.62) but that requires far fewer multiplications
per stage (on the order of p). This procedure is based on stochastic
approximation methods and replaces (B.61)-(B.62) with

    K(N) = \frac{g\,\ell(N)}{100 + \ell'(N)\,\ell(N)}    (B.73)

where g is a gain to be determined by experimentation (see [B-47]). We
refer the reader to [B-24,47] for details and experimental results.
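
The saving is easy to see in code. The sketch below evaluates the gain (B.73)
and applies it in a correction of the same residual-driven form as the
recursion discussed above; no covariance matrix is propagated, so each stage
costs on the order of p operations. The function names and the numerical
value of g are illustrative assumptions.

```python
def sa_gain(ell, g):
    """Gain of (B.73): K(N) = g * ell(N) / (100 + ell'(N) ell(N))."""
    denom = 100.0 + sum(v * v for v in ell)
    return [g * v / denom for v in ell]


def sa_update(a, ell, y, g):
    """Residual-driven coefficient update using the (B.73) gain."""
    residual = y - sum(l * c for l, c in zip(ell, a))
    gain = sa_gain(ell, g)
    return [c + k * residual for c, k in zip(a, gain)]


K = sa_gain([3.0, 4.0], g=2.0)                      # denominator is 100 + 25
a_next = sa_update([0.0, 0.0], [3.0, 4.0], 5.0, g=2.0)
```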

Finally, we add one note concerning the relative merits of
the autocorrelation and covariance methods. As pointed out by Makhoul
[B-115], the autocorrelation method offers the advantage of guaranteeing the
stability of the resulting all-pole filter; however, the fact that the method
relies on setting y(i)=0 outside the available range of data leads to
spectral distortion. The covariance method, on the other hand, avoids the
distortion problem by not considering points outside the given range, but it need
not lead to a stable filter. As stability for these methods is guaranteed if
and only if all of the reflection coefficients have magnitude less than one
[B-30,115], a number of modified covariance-type methods that have this
property have been devised. We refer the reader to [B-115] for a discussion of
the relative merits of several methods and a new fast algorithm. We also note
that Morf et al. [B-71] point out that if one considers a hybrid method, in
which we define y(i)=0 for N+1 \le i \le N+p but do not use y(j), j<0, we can
guarantee the stability of the resulting filter and can still obtain fast
algorithms (due to the product-of-Toeplitz form of the covariance matrix).

Note on (B.73): In [B-47] the gain K(N) is calculated in a slightly different
way because of the inclusion of quantization effects.
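
The reflection-coefficient test mentioned above can be sketched as follows.
Given predictor coefficients a_1,...,a_p of A(z) = 1 + a_1 z^{-1} + ... +
a_p z^{-p}, the standard step-down (backward Levinson) recursion recovers the
reflection coefficients one order at a time; the all-pole filter 1/A(z) is
stable exactly when every one of them has magnitude less than one. The
function names are illustrative, and sign conventions for reflection
coefficients differ between authors (the magnitude test is unaffected).

```python
def reflection_coefficients(a):
    """Step-down recursion on a = [a_1, ..., a_p].

    Returns the reflection coefficients from order p downward, stopping at
    the first one with magnitude >= 1 (the recursion cannot continue past it).
    """
    a = list(a)
    ks = []
    while a:
        k = a[-1]                      # k_m is the last coefficient at order m
        ks.append(k)
        if abs(k) >= 1.0:
            break
        m = len(a)
        # a_i^(m-1) = (a_i^(m) - k_m a_(m-i)^(m)) / (1 - k_m^2), i = 1..m-1
        a = [(a[i] - k * a[m - 2 - i]) / (1.0 - k * k) for i in range(m - 1)]
    return ks


def is_stable(a):
    """True if 1/A(z) has all poles strictly inside the unit circle."""
    return all(abs(k) < 1.0 for k in reflection_coefficients(a))


stable = is_stable([-1.5, 0.7])        # poles at radius sqrt(0.7)
unstable = is_stable([-2.0, 0.75])     # poles at 1.5 and 0.5
```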


B.3 Design of a Predictor as a Stochastic Realization Problem 

A problem that has attracted a great deal of attention in the control
and estimation literature is the stochastic realization problem [B-7,11,13,
15,20,21,22,37,63,67,72,85,90,105]. Briefly stated, (a special version of)
the stochastic realization problem asks the following: given a stationary
Gaussian random process y (taken as a scalar here for simplicity^12) with
correlation function R(n), find a Markovian representation


    x(n+1) = A\,x(n) + w(n)
    y(n) = c'x(n)    (B.74)


where w is a zero-mean white noise process with covariance Q. Referring to
(B.19)-(B.25), we see that this is equivalent to finding a factorization of
R of the form

    R(i) = c'A^i b,    i \ge 0    (B.75)

where

    b = Pc
    APA' - P = -Q    (B.76)
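
As a small numerical illustration of (B.74)-(B.76), the sketch below solves
the Lyapunov equation in (B.76) by simple fixed-point iteration (valid when A
is stable) and then evaluates R(i) = c'A^i b with b = Pc. The helper names and
the two-state example are illustrative assumptions.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def stationary_covariance(A, Q, iters=200):
    """Solve A P A' - P = -Q by iterating P <- A P A' + Q (A assumed stable)."""
    n = len(Q)
    P = [row[:] for row in Q]
    At = transpose(A)
    for _ in range(iters):
        APA = matmul(matmul(A, P), At)
        P = [[APA[i][j] + Q[i][j] for j in range(n)] for i in range(n)]
    return P


def correlation(A, c, P, i):
    """R(i) = c' A^i b with b = P c, for i >= 0."""
    n = len(c)
    v = [sum(P[r][k] * c[k] for k in range(n)) for r in range(n)]   # b = P c
    for _ in range(i):
        v = [sum(A[r][k] * v[k] for k in range(n)) for r in range(n)]
    return sum(c[r] * v[r] for r in range(n))


A = [[0.5, 0.0], [0.0, -0.5]]
Q = [[0.75, 0.0], [0.0, 0.75]]
c = [1.0, 1.0]
P = stationary_covariance(A, Q)                  # converges to the identity
R = [correlation(A, c, P, i) for i in range(3)]  # 0.5**i + (-0.5)**i
```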


^12 We note that the various algorithms discussed in this section have been
extended, in most cases nontrivially, to the vector case.

Examining (B.75), (B.76), we see that the algorithm falls naturally into
two pieces: (1) find a triple {A,b,c} satisfying (B.75); (2) find P and Q
satisfying (B.76). One of the best-known studies of this problem is that of
Faurre [B-21,57,85]. As he pointed out, the first step of the algorithm is
simply the well-known deterministic realization problem when one is given
the "weighting pattern" R(0), R(1), R(2), ... . This problem has been widely
studied in the literature [B-9,10,11,12,13,14,72,106,107], and we will make a
few comments about this aspect of the problem in a few moments. Before
discussing the numerical aspects of the first step or the details of the second,
let us see what the first part yields in the frequency domain (here we follow
[B-63]). Let us define the power spectral density

    S_y(z) = \sum_{i=-\infty}^{+\infty} R(i)\,z^{-i}    (B.77)

Then, using the fact that R(-i) = R(i), we see that the factorization (B.75)
yields^13

    S_y(z) = c'(zI-A)^{-1} z\,b + c'(z^{-1}I-A)^{-1} A\,b    (B.78)

Noting the form of (B.78), and defining

    a(z) = \det(zI-A)    (B.79)

we see that the first step in the algorithm yields^14

    S_y(z) = \frac{p(z)}{a(z)\,a(z^{-1})}    (B.80)

^13 If we had realized \tfrac{1}{2}R(0), R(1), R(2), ..., instead of R(0), R(1),
R(2), ..., we would have a more symmetrical version of (B.78) (see [B-63]).
Note that equality of (B.77) and (B.78) is as formal power series.

^14 Note the assumption that we can factor R as in (B.75) implies (and is
implied by) the fact that S_y(z) is a rational function.


That is, we have obtained a factorization of the denominator of S_y. If we
can also factor the numerator,

    S_y(z) = \gamma\,\frac{\beta(z)\,\beta(z^{-1})}{a(z)\,a(z^{-1})},    \gamma > 0^{15}    (B.81)

we will have determined the desired transfer function

    G(z) = \frac{\beta(z)}{a(z)}    (B.82)

which, when driven by white noise with spectrum \gamma (i.e. with standard
deviation \gamma^{1/2}), yields the spectrum S_y(z). It is clear from (B.74)
that it is this second part of the spectral factorization that is accomplished
by the second step of the stochastic realization algorithm. Finally, note that
the model (B.82) contains both poles and zeroes (it is an
autoregressive-moving-average (ARMA) model).

^15 We choose \beta and a to consist of those poles and zeroes of S_y(z) that
lie within the unit circle. This will guarantee the stability of G and of its
inverse (see [B-63]).

There are several methods for performing the second step of the algorithm.
Faurre [B-21,85] showed that (B.76) could be solved for any P inside a given
range

    P_* \le P \le P^*    (B.83)

(here inequality is in the matrix sense), and he identified the smallest such
covariance, P_*, as that arising from an innovations representation of y, i.e.
a Kalman filter (see Gevers-Kailath [B-67] for a full description). This
representation is of the form


    \xi(n+1) = A\,\xi(n) + K\,\varepsilon(n+1)
    y(n) = c'\xi(n)    (B.84)

where \varepsilon is an innovations process with covariance

    R_\varepsilon = c'b - c'P_*c    (B.86)

and P_* is the solution of the algebraic Riccati equation

    P_* = A P_* A' + \frac{A\,[b - P_*c]\,[b - P_*c]'\,A'}{c'b - c'P_*c}    (B.87)

Then the Kalman gain is given by

    K = \frac{b - P_*c}{c'b - c'P_*c}    (B.88)

Comparing this with (B.26), we see several differences. First of all, in
(B.26) we had an equation of the form

    x(n+1) = A\,x(n) + A K\,\varepsilon(n)
    y(n) = c'x(n) + \varepsilon(n)    (B.89)
The differences between (B.84) and (B.89) can be explained by noting that (B.84)
is a representation based on the filtered estimate of x(n) (given
y(0),...,y(n)), while (B.89) is based on the one-step predicted estimate of
x(n) (given y(0),...,y(n-1)). We also note that it is easy to pass from one of
these representations to the other (see [B-67]).

Thus, examining (B.84)-(B.88), we see that the second step of the
algorithm consists of solving the equations defining a steady-state Kalman
filter, and again the most difficult step is solving for the covariance (in
this case P_*) from the nonlinear equation (B.87). However, note that P_*
itself is not needed in (B.84). All we really need are R_\varepsilon and K.
Thus, an alternative procedure is to use the "fast algorithms" described in
Subsection B.1 (see [B-63,69] for the development of this idea). These will
produce the time-varying histories of K and R_\varepsilon. If we let the
transients (due to the finite data with which the filter must work to produce
an estimate) die out, we will obtain the steady-state K and R_\varepsilon. We
note that although this approach involves solving for K and R_\varepsilon
recursively (in time), this procedure may be much faster than direct
solution of (B.86)-(B.88).

Before turning to an alternative approach, let us note that once we have
K, we have in fact determined the optimal recursive predictor or filter (i.e.
comparing (B.26) and (B.89), we can readily turn the innovations representation
into a one-step predictor). Note also that this model is causal and causally
invertible [B-67,69] and hence the method can be interpreted as an inverse
filter approach to the identification of G(z), i.e. we have equivalently
determined the optimal predictor or a whitening filter. Also, as mentioned
before, this method allows for zeroes in the model. A method of this type
was proposed in [B-69]. Actually, in that reference it was proposed that one
might benefit from the use of the time-varying innovations representation
(before it reaches steady state). We refer the reader to [B-69,72] for more
on the time-varying problem. We will have more to say about the numerical
aspects of the steady-state algorithm in a moment.

There is an alternative approach to the Kalman filter method for finding
a factorization of the numerator of S_y(z). Examining (B.80), suppose we pass
the process y through the all-zero filter a(z). The resulting process \eta has
power spectral density p(z), i.e. it is a finitely correlated (moving average
(MA)) process. Given its correlation function p(z), one wishes to factor it
(with the constant \gamma of (B.81) absorbed into \beta) as

    p(z) = \Big(\sum_{i} \beta_i z^{-i}\Big)\Big(\sum_{i} \beta_i z^{i}\Big) = \beta(z)\,\beta(z^{-1})    (B.90)


As described in [B-11,13,37,56], this is equivalent to obtaining a
factorization of the infinite symmetric Toeplitz matrix (with finitely
many nonzero diagonals)

    (B.91)

into the product of an upper triangular matrix and its transpose. Recursive
procedures for this are discussed in [B-37], and clearly the Levinson-type
algorithm can be used in this scalar case. As the recursion proceeds, certain
of the elements of the Cholesky factor converge to the desired \beta_i (see
[B-13]). Clearly an alternative to this procedure is to find the innovations
representation of \eta using the fast algorithms described earlier. This method
is closely related to the "fast Cholesky" algorithms, and the reader is referred
to [B-63] for details (see also [B-72]). For a detailed discussion and new
results on the use of the Riccati equation for spectral factorization, we refer
the reader to [B-112].
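
The convergence of the Cholesky factor can be observed on a finite section of
the Toeplitz matrix. The sketch below is an illustration only: it uses a plain
(not fast) Cholesky routine and the lower-triangular convention T = LL', in
which it is the bottom row of L that approaches the coefficients \beta_i. The
function names and the section size n are assumptions.

```python
import math


def cholesky_lower(T):
    """Plain Cholesky factorization T = L L' of a positive definite matrix."""
    n = len(T)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = T[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def ma_spectral_factor(p, n=40):
    """Approximate beta with p(z) = beta(z) beta(1/z).

    p = [p_0, p_1, ..., p_m] are the correlations of an MA(m) process.
    The n x n banded Toeplitz section is factored, and the last row of L,
    read from the diagonal leftward, is returned as [beta_0, ..., beta_m].
    """
    m = len(p) - 1
    T = [[p[abs(i - j)] if abs(i - j) <= m else 0.0 for j in range(n)]
         for i in range(n)]
    last_row = cholesky_lower(T)[n - 1]
    return [last_row[n - 1 - k] for k in range(m + 1)]


# eta(k) = e(k) + 0.5 e(k-1) has p_0 = 1.25 and p_1 = 0.5.
beta = ma_spectral_factor([1.25, 0.5])
```

For this example the returned coefficients approach (1, 0.5), the factor whose
zero lies inside the unit circle.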

Let us now turn to the numerical aspects of this two-stage procedure. We
concentrate here on the first stage, i.e. the computation of the
factorization (B.75). The algorithms of Rissanen [B-11] and Ho [B-106] are
based on examination of the Hankel matrix

    H_N = \begin{bmatrix}
            R(0)   & R(1)   & R(2)   & \cdots & R(N-1)  \\
            R(1)   & R(2)   & R(3)   & \cdots & R(N)    \\
            \vdots &        &        &        & \vdots  \\
            R(N-1) & R(N)   & R(N+1) & \cdots & R(2N-2)
          \end{bmatrix}    (B.92)

It is well known [B-107] (see also Subsection C.1) that R admits a factorization
(B.75) if and only if there is some integer n such that

    \operatorname{rank} H_N \le n    \quad \forall N    (B.93)





Ho's original algorithm yielded a minimal realization (i.e. dim A in (B.75)
is as small as possible) if a bound n was known in advance. A far more
critical question (from a practical point of view) is the partial realization
question. Here we take into account that we only have available a finite
number of correlations R(0), R(1), ..., R(N-1), and one would like to obtain
the minimal factorization that matches these. One can use Ho's algorithm for
this, but it is not recursive, i.e. if we incorporate R(N), we must re-solve
the whole problem. Fortunately, Rissanen [B-11] and Dickinson et al. [B-9]
have developed efficient, recursive procedures (the latter of which is based
on the Berlekamp-Massey algorithm [B-10], which was developed for the scalar
case). We note that these algorithms essentially solve the Padé approximation
problem, and we refer the reader to the references for details.

Thus, efficient algorithms exist for spectral factorization and one would
expect good results if the process y truly has a Markovian representation and
if one has the exact values of the correlations. This points out a conceptual
difference between linear prediction and the above stochastic realization
procedure. In linear prediction, no pretense is made about exactly matching a
model. All that is wanted is a least-squares fit, and thus one would expect
this procedure to be relatively robust when one uses a finite record of real
data to generate an estimate of the correlation function which is then used in
the linear prediction procedure. On the other hand, it can easily be seen
that an infinitesimal perturbation of the Hankel matrix in (B.92) can make it
have full rank. In this case, the partial realization procedures, which in
essence are looking to match a model exactly, will yield a system of extremely
high dimension. Thus, it appears that these algorithms are inherently sensitive
to errors in estimates of the correlation coefficients. In addition, if y has
no Markovian representation, the linear prediction approach will still work
fine, but the partial realization procedures, which are based on exact model
matching, may very well go astray as they try to fit the data "too closely".
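
This sensitivity is easy to reproduce. In the sketch below, the correlation
sequence R(i) = 0.5^i of a one-state model gives a Hankel matrix of rank one,
while changing the single value R(3) by 10^{-3} produces a matrix of full
rank. The rank routine (Gaussian elimination with a pivot tolerance), the
tolerance, and the example sequence are illustrative assumptions.

```python
def hankel(R, N):
    """N x N Hankel matrix of (B.92); needs R(0), ..., R(2N-2)."""
    return [[R[i + j] for j in range(N)] for i in range(N)]


def numerical_rank(M, tol=1e-9):
    """Rank by Gaussian elimination; pivots of magnitude <= tol count as zero."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    rank = 0
    for col in range(cols):
        if rank == rows:
            break
        piv = max(range(rank, rows), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) <= tol:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        for r in range(rank + 1, rows):
            f = M[r][col] / M[rank][col]
            for k in range(col, cols):
                M[r][k] -= f * M[rank][k]
        rank += 1
    return rank


R_exact = [0.5 ** i for i in range(7)]
R_noisy = R_exact[:]
R_noisy[3] += 1e-3                       # a small error in one correlation

rank_exact = numerical_rank(hankel(R_exact, 4))
rank_noisy = numerical_rank(hankel(R_noisy, 4))
```

Here rank_exact is 1 and rank_noisy is 4, so an exact-matching procedure
applied to the perturbed data would return a four-state model.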

Does this mean that the above procedure is of no use in identifying
parameters in a speech model? The answer to that is perhaps not. What is needed
is a modification of the first step of the stochastic realization algorithm.
As the version described here stands, it is too sensitive, and in fact De Jong
[B-108] has shown that these methods are numerically unstable in that the
inexact minimal realization supplied by these algorithms, as implemented on a
finite-wordlength computer, may not be a "numerical neighbor" of the sequence
{R(i)} that is to be factored. A great deal of the difficulty is due to the
ill-posedness of the problem of finding the rank of the Hankel matrix. By
rephrasing the algorithm in terms of the \epsilon-rank (the least rank of all
systems within an "\epsilon-neighborhood" of the given sequence), De Jong
obtains a slower algorithm that is similar to Rissanen's but is numerically
stable. This approach is extremely appealing for two reasons: (1) we can,
within this framework, seek minimal realizations in the \epsilon-neighborhood
of a sequence {R(i)} that itself is not realizable by a finite-dimensional
system; (2) we can seek the "nearest" reduced-order realization of given
dimension of a given system. These two properties may help overcome some of the
sensitivity problems with the two-step procedure.
In addition to the work of De Jong, a number of other methods have been
proposed for "approximate" Padé approximations, and any of these could be
used as the first step in the algorithm. McDonough and Huggins [B-113] propose
to approximate a time function f(t) by a sum of (possibly complex) exponentials

    f_a(t) = \sum_{i=1}^{N} \alpha_i\,e^{s_i t}

They study numerical methods for the iterative determination of the \alpha_i
and s_i that minimize

    \int_0^T e^2(t)\,dt

where e is the signal error

    e(t) = f(t) - f_a(t)

One needs iteration, as this is a nonlinear problem. This is clearly closely
related to the discrete-time problem of finding {A,b,c} with A n x n (n fixed)
to minimize some function of the error

    e(i) = R(i) - c'A^i b

Some effort has been put into this problem in the recent past [B-19,75,110],
and one possibility, of course, is the all-pole approximation, e.g. we
might perform linear prediction with the R(i) as the observed signal (regarded
as the impulse response of some filter). This would require computing the
correlation of R(i), or, in other words, the correlation of the correlation of
the y(i)! Note that the all-pole assumption for R(i) would not necessarily
lead to an all-pole model for G(z) in (B.82).

Another possible method has been proposed by Burrus and Parks [B-114].
They consider approximating

    r(z) = R(0) + R(1)\,z^{-1} + R(2)\,z^{-2} + \cdots

by

    G(z) = \frac{a_0 + a_1 z^{-1} + \cdots + a_{N-1} z^{-N+1}}{1 + b_1 z^{-1} + \cdots + b_{M-1} z^{-M+1}} = \frac{a(z)}{b(z)}

In addition to specifying some exact realizability conditions on {R(i)}
(which can easily be reduced to Hankel matrix conditions and statements),
they suggest the following: we would like

    r(z) = \frac{a(z)}{b(z)}

Multiplying by b(z), we obtain

    b(z)\,r(z) = a(z)

and if we attempt to minimize some norm on the difference between these
quantities (called the equation error), we can obtain linear approximation
algorithms. We refer the reader to [B-114] for details.
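
To indicate why the equation-error formulation is linear, the following sketch
fits the denominator by least squares and then reads off the numerator. It is
a generic illustration under our own assumptions (a squared-error norm, normal
equations solved by Gaussian elimination, and hypothetical function names),
not the specific algorithm of [B-114]. Matching coefficients of z^{-k} in
b(z)r(z) = a(z) gives r(k) + b_1 r(k-1) + ... = 0 for every k beyond the
numerator order, which is linear in the b_j.

```python
def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    A = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (A[i][n] - sum(A[i][k] * x[k] for k in range(i + 1, n))) / A[i][i]
    return x


def equation_error_fit(r, n_num, n_den):
    """Fit r(z) ~ a(z)/b(z) by minimizing the squared equation error.

    n_num : number of numerator coefficients a_0, ..., a_{n_num-1}
    n_den : number of free denominator coefficients b_1, ..., b_{n_den}
    """
    def get(k):
        return r[k] if k >= 0 else 0.0

    ks = range(n_num, len(r))            # equations with no numerator term
    rows = [[get(k - j) for j in range(1, n_den + 1)] for k in ks]
    rhs = [-r[k] for k in ks]
    G = [[sum(row[i] * row[j] for row in rows) for j in range(n_den)]
         for i in range(n_den)]
    g = [sum(row[i] * t for row, t in zip(rows, rhs)) for i in range(n_den)]
    b = [1.0] + solve(G, g)
    a = [sum(b[j] * get(k - j) for j in range(n_den + 1)) for k in range(n_num)]
    return a, b


# r(i) = 2 * 0.5**i is the expansion of 2 / (1 - 0.5 z^-1).
r = [2.0 * 0.5 ** i for i in range(12)]
a_fit, b_fit = equation_error_fit(r, n_num=1, n_den=1)
```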

We close by noting that initial results ([B-109], [B-111]) utilizing the
two-step procedure indicate the potential of the approach. In particular, the
work at IRIA [B-109] has produced good results for the design of whitening
(inverse) filters. Given this limited success and the previous discussion,
it appears that the utility of the two-step stochastic realization procedures
merits further investigation.

B.4 Some Other Issues in System Identification 

It is appropriate to mention several other identification procedures.
Recall that in Subsection B.2 we saw that the covariance method was
equivalent to a Kalman filter when we recursively update our estimates of the
predictor coefficients. As discussed in [B-17], several other recursive
identification schemes can also be considered as Kalman filter-type algorithms.
One of these is the instrumental variables approach, which bears some
similarity to the least squares algorithm and which, in fact, leads to Toeplitz
equations in the stationary case [B-91]. In that reference it is pointed
out how one can devise the Toeplitz Yule-Walker equations to determine the
poles (AR part) in an ARMA model.* This in essence requires knowledge of
the order of the MA part and thus is much more apt to lead to the sensitivity
problems that one confronts in using a technique that is based on the
assumption that the data obeys certain constraints (as in the first step of the
stochastic realization algorithm of the preceding subsection). In addition,
we no longer are guaranteed that the solution to the Yule-Walker equations leads
to a stable inverse filter.

* This method is similar in spirit to the Burrus-Parks generalized-Padé
equation-error approach for the determination of the denominator of a pole-zero
model [B-114].

The methods of least squares (covariance) and instrumental variables, as
described in [B-17], are used for all-pole models of the noise (prediction
error)/output behavior (i.e. for AR models). However, both the usual least
squares and the instrumental variables methods can be easily modified for the
identification of input zeroes, i.e. consider the model

    y(k+1) + a_1 y(k) + \cdots + a_p y(k-p+1) = b_0 u(k) + b_1 u(k-1) + \cdots + b_m u(k-m) + e(k)    (B.94)


where we measure both the y's and u's (here e(k) is the driving noise, or
equivalently, the "equation error"). In this case, let

    \theta' = (-a_1, \ldots, -a_p,\ b_0, \ldots, b_m)    (B.95)

    \phi'(k) = (y(k), \ldots, y(k-p+1),\ u(k), \ldots, u(k-m))    (B.96)

Then the recursive least squares procedure reduces to a Kalman filter for
the system

    \theta(k+1) = \theta(k)
    y(k+1) = \phi'(k)\,\theta(k) + e(k)    (B.97)


Although this input model is not of interest in the speech problem, it is of
great importance in control applications in which one is interested in
manipulating the system via the input u. We refer the reader to [B-17] for
the analogous development for the instrumental variables method.

There are two other algorithms in [B-17] that are of interest. These
methods allow zeroes both in the input/output response and in the noise/output
response, i.e. they can be used to identify ARMA models. Both of these
algorithms are recursive (in the data), approximate maximum likelihood methods,
and both methods are of the Kalman filter type. The second of these (RML2 in
[B-17]) is discussed in detail in [B-18]. The first of these, RML1, is, in
some sense, an approximation to RML2, and we outline the basic idea. Consider
the ARMA model

    y(k+1) + a_1 y(k) + \cdots + a_p y(k-p+1) = e(k) + c_1 e(k-1) + \cdots + c_q e(k-q)    (B.98)

We can rewrite (B.98) as

    \theta(k+1) = \theta(k)
    y(k+1) = \phi'(k)\,\theta(k) + e(k)    (B.99)

where

    \theta' = (-a_1, \ldots, -a_p,\ c_1, \ldots, c_q)
    \phi'(k) = (y(k), \ldots, y(k-p+1),\ e(k-1), \ldots, e(k-q))    (B.100)


Having \phi, one could again devise a Kalman-filter structure for the estimate
\hat{\theta}. However, the noises, e, are not known. As suggested in
[B-17,82,83], a natural approximation is to replace e(j) in (B.100) by its
estimated value, i.e. the residual

    \hat{e}(j) = y(j+1) - \hat{\phi}'(j)\,\hat{\theta}(j)    (B.101)

If we do this, we obtain the following recursive scheme:

    \hat{\theta}(j+1) = \hat{\theta}(j) + K(j+1)\,\hat{e}(j)    (B.102)

    K(j+1) = \frac{P(j)\,\hat{\phi}(j)}{1 + \hat{\phi}'(j)\,P(j)\,\hat{\phi}(j)}    (B.103)

    P(j+1) = P(j) - \frac{P(j)\,\hat{\phi}(j)\,\hat{\phi}'(j)\,P(j)}{1 + \hat{\phi}'(j)\,P(j)\,\hat{\phi}(j)}    (B.104)

    \hat{\phi}'(j) = (y(j), \ldots, y(j-p+1),\ \hat{e}(j-1), \ldots, \hat{e}(j-q))    (B.105)


We refer the reader to [B-17] for a detailed description of this and the
other algorithms. In addition, uniqueness of stationary points and the
stability of these algorithms are considered in detail in this reference. In
particular, it is shown that RLS is stable and has a unique solution, that
RML1 and RML2 have unique solutions for ARMA models, that RML2 always converges,
and that RML1 converges for MA models and for first-order ARMA models, but that
it may diverge in higher-order cases (an example is given). The reader is
referred to [B-17] for details, further references, and many insights into
the characteristics of these identification procedures. Also, we refer the
reader to [B-120] for fast on-line algorithms for these identification schemes.
These methods are analogous to that mentioned earlier for the covariance method.
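
A minimal sketch of the RML1 recursion (B.101)-(B.105) follows. The function
names, the zero initial estimate, and the initial covariance p0 are
illustrative assumptions. With q = 0 the regressor contains no estimated
noise terms and the scheme reduces to ordinary recursive least squares.

```python
def rml1_step(theta, P, phi, y_next):
    """One step of (B.101)-(B.104) for a given regressor phi."""
    n = len(theta)
    e_hat = y_next - sum(phi[i] * theta[i] for i in range(n))        # (B.101)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]                                # (B.103)
    theta = [theta[i] + gain[i] * e_hat for i in range(n)]           # (B.102)
    P = [[P[i][j] - P_phi[i] * P_phi[j] / denom for j in range(n)]
         for i in range(n)]                                          # (B.104)
    return theta, P, e_hat


def rml1(y, p, q, p0=100.0):
    """Estimate theta' = (-a_1..-a_p, c_1..c_q) from the record y."""
    n = p + q
    theta = [0.0] * n
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    e_hist = [0.0] * q                 # e_hat(j-1), ..., e_hat(j-q)
    for j in range(p - 1, len(y) - 1):
        phi = [y[j - i] for i in range(p)] + e_hist                  # (B.105)
        theta, P, e_hat = rml1_step(theta, P, phi, y[j + 1])
        e_hist = ([e_hat] + e_hist)[:q]
    return theta


# Noise-free first-order example: y(k+1) = 0.5 y(k), so theta = -a_1 = 0.5.
y = [0.5 ** k for k in range(30)]
theta_hat = rml1(y, p=1, q=0, p0=1e6)
```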

We note that the methods described above and in the preceding subsection in
principle allow one to identify poles as well as zeroes. In addition, several
other methods for zero modelling have been described in the literature [B-26,68,
100,113,114,123]. The method in [B-68] is based on cepstral analysis. Let
Y(z) be the z-transform of a signal y, which we wish to model as the response
of an ARMA model:

    Y(z) = \frac{N(z)}{D(z)}    (B.106)

Usual linear prediction (with care taken to avoid the zeroes [B-26,68,100])
will identify D. Suppose now we define the complex cepstrum \hat{y}(n) so that

    \hat{Y}(z) = \log Y(z)

Then the z-transform of n\hat{y}(n) is

    -z\,\frac{d\hat{Y}(z)}{dz} = -z\,\frac{D(z)N'(z) - N(z)D'(z)}{N(z)\,D(z)}    (B.107)

and thus linear prediction on n\hat{y}(n) will identify the zeroes (and the
poles) of y. We refer the reader to [B-26,68] for more on this technique. In
addition, the generalized Padé methods in [B-113,114,123] can also be used for
pole-zero modeling directly (as well as for the first step of the two-step
procedure of the preceding section). Also, Atashroo and Boll [B-21] have
suggested a multi-step procedure in which one performs linear prediction to
obtain the poles, inverse filters to obtain a finitely correlated sequence, uses
linear prediction again to obtain a high-order all-pole model of this sequence,
and then performs a third linear prediction to obtain a lower-order all-zero
inverse of the all-pole model.


One further issue that we have not discussed is the determination of an
appropriate order for the parametric model to be identified. Clearly, as
we allow more and more free parameters, we can get a better and better fit,
but one would expect a diminishing return beyond a certain number of parameters.
Åström and Eykhoff [B-16] propose one test criterion, while Akaike [B-32,92;
B-15, p.716] (see also [B-26]) proposed an information-theoretic criterion
which provides a direct tradeoff between the value of the log-likelihood
function and the number of free parameters in the model. Recently, Rissanen and
Ljung [B-79] have obtained a related criterion that incorporates the assumed
model structure (as well as the number of parameters).
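
The tradeoff in Akaike's criterion can be sketched in a few lines. We use the
common form AIC = -2 log L + 2k, and for a Gaussian model with estimated
residual variance \sigma^2 over N samples we take -2 log L = N log \sigma^2 up
to an additive constant. The function names and the list of residual variances
are illustrative assumptions, not results from the references.

```python
import math


def aic(residual_variance, num_params, num_samples):
    """AIC up to an additive constant, for Gaussian residuals."""
    return num_samples * math.log(residual_variance) + 2 * num_params


def select_order(variances, num_samples):
    """variances[p] is the residual variance of the order-p fit."""
    scores = [aic(v, p, num_samples) for p, v in enumerate(variances)]
    return min(range(len(scores)), key=scores.__getitem__)


# Hypothetical residual variances for orders 0..4: the fit keeps improving,
# but beyond order 2 the gain no longer pays for the extra parameters.
best = select_order([1.0, 0.5, 0.3, 0.299, 0.2989], num_samples=100)
```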

In this section we have examined a number of aspects of the identification-
estimation problem, and we have pointed out a number of similarities between
the goals and techniques of the two disciplines. We have also seen some of the
differences, but others have not been discussed. In particular, in this section
we have treated identification for identification's sake. As pointed out in
[B-16], in control system design identification is often simply a means toward
the goal of efficient control. Thus, in many control applications, the value
of identification is not measured by the accuracy of the parameter estimates,
but rather by the performance of the overall system. This is discussed somewhat
in [B-17] and also in the study of "self-tuning regulators" [B-80,81]. In
addition, in control one has several types of identification problems, since one
has the opportunity to excite the system through inputs. One finds somewhat
different problems if the system is operating open loop, in a time-invariant
closed-loop mode, or in an adaptive closed-loop mode. We refer the reader
to [B-15,17] for more on this subject and for further references. Finally, in
the control context, one often deals with systems for which one is interested
in determining system structure as well as in identifying the parameters of a
model. The issues involved here are complex and are discussed in [B-15,16].

On the digital filtering side, one is often interested in the accuracy of 
the parameter estimates. This is of importance, for example, if one is at- 
tempting to design an all-pole filter that matches a given impulse response in 
a least squares sense, or if one is attempting to estimate formants from 
an all-pole speech model. On the other hand, for linear predictive coding, 
the accuracy of the parameters may be of secondary interest, while the primary 
concern is more efficient coding of speech data. In this case, accuracy is 
of importance only in so far as it makes the coding scheme more efficient. 

In this regard, a very important question involves the quantization of the pre- 
dictor specifications — that is, what is the most efficient method for trans- 
mitting the specifications of the all-pole model. As discussed in [B-119] , 
the reflection coefficients (from which one can construct the filter) offer 
the most efficient parametrization from a quantization point of view. 

We note that the linear prediction approach appears to be particularly 
well-suited to the speech problem. The all pole model is a good one in many 
cases (from a physical point of view) , the algorithms are fast, the intermediate 
variables in the algorithm (i.e. the partial correlation coefficients) have 
useful physical interpretations, the linear prediction procedure tends to match
the spectral envelope, etc. (see [B-26] for many of the properties of linear
prediction and [B-116] for some of its statistical properties). Finally and
above all, linear prediction has been proven in practice to work well on speech
signals, and further work is needed before one can say with confidence that
any of the other techniques described in this section can improve upon this 
performance. 

Thus, we see that there are a surprising number of relationships, simi- 
larities, and differences among the techniques and goals of researchers in both 
disciplines who are concerned with parameter identification. The possibilities 
for collaboration and interaction that will benefit all involved seem particu- 
larly abundant in this area. In particular, we have barely scratched the 
surface on the question of the relative merits of the various methods or the 
issue of precisely what problems a particular method addresses and does not 
address. A thorough investigation of questions such as these remains for the 
future. 
C. Synthesis, Realization, and Implementation

In this section we consider the question of design. However, our
discussion will not deal very much with design methods but rather with the
question of trying to pinpoint what researchers in the two disciplines mean by
"design" and what sorts of problems their techniques are equipped to handle.
As we shall see, the issues considered in the two fields are often quite
different, but there are many occasions in which techniques from one discipline
could be of use in the other. Also, the problem of implementation confronts
designers in both disciplines.


C.1 State Space Realizations and State Space Design Techniques

State space concepts and methods have a number of uses from a design point
of view. Let us first take a look at realization theory [A-64,68,B-12,C-2-13]
and recall some of its basic concepts (see [A-64,B-12,C-2] for details and for
further references). We will follow [B-12] and will state several results in
the continuous-time framework, but analogous results hold for the
discrete-time problem (see the last part of Subsection B.3).
We are interested in time-varying linear system representations of the form

    \dot{x}(t) = A(t)\,x(t) + B(t)\,u(t),    x(t_0) = x_0
    y(t) = C(t)\,x(t)    (C.1)


where x(t) \in R^n, u(t) \in R^m, y(t) \in R^r, and A, B, C are matrices of
appropriate dimension. From an input-output point of view, the system (C.1) is
equivalent to the representation

    y(t) = C(t)\,\Phi(t,t_0)\,x_0 + \int_{t_0}^{t} C(t)\,\Phi(t,\tau)\,B(\tau)\,u(\tau)\,d\tau    (C.2)


where $ is the nxn state-transition matrix 


#Ct,CT) = A(t)#(t,(J) , f(o-,cr)=I (C.3) 

and the matrix 

H(t,T) = C(t)f (t,T)B(T) t^T (C.4) 

IS the impulse response matrix. As pointed out in [B-12 ] , in many control and 
estimation problems, we are often interested in the weighting pattern matrix^ 

K(t,T) = C(t}f (t,T)B(T) t,T (C.5) 

If A,B, and C are constant, then # and K have particularly nice expressions 

#{t,T) = , K(t,T) = Vt,T (C.6) 

and in this case, given the dependence on t-T only, we write K{t,0)=K(t) , 
H(t,0}=H(t). Also, in this case, an equivalent input-output representation 
IS provided by the Laplace transform of H(t) — the transfer function 

G(s) “ L[H-<t)], = C(ls-A)“^B (C.7) 


As mentioned in [B-12,C-10], if K is real analytic in t and $\tau$ (as it is if A,
B, C are constant), then (C.4) and (C.5) are equivalent, since H has a unique
extension to $\tau > t$. Otherwise, there can be nonunique extensions [C-10].

The realization problem, then, is to obtain a recursive description of
the form (C.1) when we are given the weighting pattern, impulse response function,
or transfer function. It can easily be seen that if a realization exists, then
many solutions exist. For example, we obtain the same weighting pattern as (C.1)
if we take $\xi = 2x$ to be our state variable


$$\dot{\xi}(t) = A(t)\xi(t) + 2B(t)u(t), \qquad y(t) = \tfrac{1}{2}C(t)\xi(t) \tag{C.7}$$

or if we take $\eta' = (x', \eta_{n+1})$ as our state

$$\dot{\eta}(t) = \begin{bmatrix} A(t) & 0 \\ 0 & \alpha \end{bmatrix}\eta(t) + \begin{bmatrix} B(t) \\ \beta(t) \end{bmatrix}u(t), \qquad y(t) = [C(t), \gamma(t)]\,\eta(t) \tag{C.8}$$

where $\alpha$ is arbitrary and either $\beta$ or $\gamma$ is identically zero. These two
examples illustrate the two basic issues that arise. In the first case, $\xi$ and x
are in some sense equivalent, since they contain identical information and one
can be obtained from the other via an invertible linear transformation. This is
not the case in the second example, in which $\eta$ carries superfluous information
(from an input-output standpoint) in its last component $\eta_{n+1}$. If $\beta = 0$,
the input can never affect $\eta_{n+1}$ (a controllability problem), while if
$\gamma = 0$, the output never sees $\eta_{n+1}$ directly or indirectly (since it is
decoupled from the other state components), which is an observability problem.

Thus, one of the key issues in realization theory involves the
characterization of minimal realizations: those which contain no superfluous
information in their state variables. We refer the reader to the references (see,
in particular, [B-12]) for the full development of realization theory for
time-invariant multivariable systems. As one might guess from the preceding
paragraph, the concepts of controllability and observability are very closely tied
to the minimality of a state space realization. For the sake of brevity, we
state the major results only for the time-invariant case (i.e. stationary
weighting pattern and constant realizations of it).

Definition C.1: A realization (time-varying or time-invariant) of a
weighting pattern or transfer function is minimal if any other realization has
a state vector of dimension at least as large.

Definition C.2: A constant linear system in state space form (C.1) is
controllable if for every state $x \in R^n$ and any T>0 there exists an input
function u(t), $t \in [0,T]$, that drives the system from x(0)=0 to x(T)=x.

Definition C.3: A constant linear system in state space form (C.1) is observable
if, for any T>0, given u(t) and y(t), $t \in [0,T]$, we can uniquely determine x(t)
in this interval.^2

Theorem C.1: Suppose we are given a stationary impulse response matrix H(t) or
its transfer function G(s). This system has a state-space representation of the
form (C.1) if and only if G(s) is a matrix of rational functions of s, each of
which is proper^3 (degree(denom.) > degree(numer.)). In this case, G(s) has
a minimal, constant realization, and, in fact, a realization

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) \tag{C.9}$$

is minimal if and only if it is controllable and observable. In addition,
any minimal constant realization can be obtained from a given one via an
invertible linear transformation of the state variable, or equivalently

$$(A,B,C) \longmapsto (PAP^{-1},\, PB,\, CP^{-1}) \tag{C.10}$$

Finally, if dim x = n, the realization (C.9) is controllable if and only if

$$\operatorname{rank}\,[\,B \;\vdots\; AB \;\vdots\; \cdots \;\vdots\; A^{n-1}B\,] = n \tag{C.11}$$

and it is observable if and only if

$$\operatorname{rank}\,[\,C' \;\vdots\; A'C' \;\vdots\; \cdots \;\vdots\; (A')^{n-1}C'\,] = n \tag{C.12}$$

^2 For time-varying systems, the intervals over which one tries to control or
observe the system may vary with time (see [A-64,B-12,C-2,C-10]).
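The rank tests (C.11) and (C.12) are easy to try numerically. The following is a minimal sketch, assuming NumPy is available; the function names and the three-state example (a two-state system augmented with one extra state, in the spirit of (C.8) with $\beta = 0$) are illustrative choices and do not come from the references.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack the blocks [B, AB, ..., A^(n-1) B] side by side, as in (C.11)."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    return np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0]

def is_observable(A, C):
    # (C.12) is the controllability test applied to the pair (A', C').
    return is_controllable(A.T, C.T)

# Hypothetical example: the third state is decoupled and receives no input.
A = np.array([[0.0, 1.0, 0.0],
              [-2.0, -3.0, 0.0],
              [0.0, 0.0, -5.0]])
B = np.array([[0.0], [1.0], [0.0]])
C = np.array([[1.0, 0.0, 1.0]])

print(is_controllable(A, B))            # the augmented system
print(is_observable(A, C))
print(is_controllable(A[:2, :2], B[:2]))  # the two-state subsystem alone
```

In this example the augmented realization fails the controllability test while the two-state subsystem passes it, which is the situation Theorem C.1 rules out for a minimal realization.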


We note that essentially the same result holds in discrete time, in
which we have the (z-transform) transfer function G(z) and we wish to represent
it as

$$G(z) = C(Iz-A)^{-1}B \tag{C.13}$$

which is equivalent to the state space description

$$x(k+1) = Ax(k) + Bu(k), \qquad y(k) = Cx(k) \tag{C.14}$$


^3 It is easy to allow deg(denom) = deg(numer) by including a feedthrough term:
y(t) = C(t)x(t) + D(t)u(t). This is readily taken care of and leads to minor
modifications of the results stated here.




Examining (C.7) and (C.13), we see that any algorithm that realizes the
continuous-time system G(s) is also a valid realization algorithm for the
discrete-time system G(z) (and vice versa). We thus will turn to the
discrete-time framework for a moment in order to gain some insight into the
realization question.

As discussed in [A-64,B-12,106,107,C-2], there are relatively simple
algorithms for obtaining controllable or observable realizations of G(z), assuming
it is given in rational form (so that we can compute the least common denominator
of all of the elements of G). The algorithm of Ho [B-106,107] and that of
Silverman and Meadows [C-13] provide methods for extracting minimal constant
realizations from the Hankel matrix (see Subsection B.3). Basically, in this
approach one writes G(z) in series form

$$G(z) = \sum_{i=0}^{\infty} T_i z^{-(i+1)} \tag{C.15}$$

and we recognize that $\{T_i\}$ is the impulse response sequence. Referring to
(C.13), the realization problem is equivalent to finding A, B, C so that

$$T_i = CA^iB \qquad \forall\, i \tag{C.16}$$


As described in [A-64,B-106,107], one can find such a factorization if and
only if the ranks of the Hankel matrices

$$H_N = \begin{bmatrix} T_0 & T_1 & \cdots & T_{N-1} \\ T_1 & T_2 & \cdots & T_N \\ \vdots & \vdots & & \vdots \\ T_{N-1} & T_N & \cdots & T_{2N-2} \end{bmatrix} \tag{C.17}$$

are bounded by some integer (and then the maximal rank of the $H_N$ is the
dimension of the minimal realization of G). If G is proper rational, one can
show that this is indeed the case and, given the degree of the least common
multiple of the denominators of elements of G, can find a particular $H_N$ that
achieves the maximal rank. From this matrix, one can then extract the minimal
realization [B-12,106,107,C-13]. However, if we are given G in the form (C.15)
as opposed to in rational form, in general one cannot easily determine if G
is rational (or equivalently if the ranks of the $H_N$ are bounded). In this case,
the partial realization algorithms discussed in Subsection B.3 are of use.
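The rank condition on the Hankel matrices can be illustrated with a small numerical sketch. It assumes NumPy, takes the impulse response matrices to be $T_i = CA^iB$ with the Hankel block in position (i, j) equal to $T_{i+j}$, and uses a hypothetical three-state realization with one superfluous state; the explicit rank tolerance is an arbitrary choice made for this illustration.

```python
import numpy as np

def markov_parameters(A, B, C, count):
    """Return [T_0, ..., T_(count-1)] with T_i = C A^i B."""
    params, AiB = [], B
    for _ in range(count):
        params.append(C @ AiB)
        AiB = A @ AiB
    return params

def hankel_rank(T, N, tol=1e-8):
    """Rank of the N-by-N block Hankel matrix whose (i, j) block is T_(i+j)."""
    H = np.block([[T[i + j] for j in range(N)] for i in range(N)])
    return int(np.linalg.matrix_rank(H, tol=tol))

# Hypothetical non-minimal realization: the third state never sees the input.
A = np.array([[0.0, 1.0, 0.0],
              [-0.06, 0.5, 0.0],
              [0.0, 0.0, 0.9]])
B = np.array([[0.0], [1.0], [0.0]])
C = np.array([[1.0, 0.0, 1.0]])

T = markov_parameters(A, B, C, 8)
ranks = [hankel_rank(T, N) for N in range(1, 5)]
print(ranks)
```

The ranks stop growing at 2 even though the realization used to generate the sequence has three states, so a minimal realization of this impulse response needs only two.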

The partial realization algorithms essentially produce minimal dimension
systems of the form (C.14) that match the expansion (C.15) up to some specified
power of $z^{-1}$, i.e. these systems match the impulse response out to some
specified point. As mentioned earlier, these algorithms have numerical
difficulties which must be overcome. However, if G is given in rational form, the
algorithms of Ho-Kalman and Silverman-Meadows provide a procedure for determining
minimal realizations (see also [A-64] for a procedure based on partial fraction
expansions).

Thus, the realization problem can, in principle, solve certain questions
related to system synthesis. The input-output description (C.2) for continuous
systems or the analogous one for time-invariant, discrete-time systems

$$y(n) = \sum_{i=0}^{n-1} T_{n-1-i}\,u(i) \tag{C.18}$$

is non-recursive in nature, i.e. equation (C.18) implies an algorithm in
which at each point in time the entire past sequence of input vectors
u(0),...,u(n-1) is multiplied by the appropriate impulse response matrices
and then summed. Clearly such an approach is feasible only if the system to
be implemented has a finite impulse response (FIR, i.e. $T_i = 0$ for all i
greater than some integer). In general, however, (C.18) requires growing memory,
and even in the FIR case, the nonrecursive implementation may require exorbitant
amounts of storage. In this case, recursive implementations are called for, and
the state space realization (C.14) provides an answer to this question. In fact,
the computation of minimal realizations allows one to find out the minimal amount
of storage that is needed in any linear, recursive realization, and one of the
most important aspects of the state-space approach is that it allows one to
consider multiple input/multiple output systems and time-varying systems. It is
this last point (the ability to handle multivariable and time-varying systems)
that is one of its most important assets from a synthesis point of view.
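The contrast between the nonrecursive sum and the recursive state update can be seen in a scalar sketch. The example below assumes a first-order system with zero initial state, so that the impulse response is $c\,a^i\,b$; the function names and numerical values are illustrative only.

```python
def output_nonrecursive(a, b, c, u):
    """Convolution sum: each output re-reads the entire input history."""
    y = []
    for n in range(len(u) + 1):
        y.append(sum(c * a ** (n - 1 - i) * b * u[i] for i in range(n)))
    return y

def output_recursive(a, b, c, u):
    """State recursion x(k+1) = a x(k) + b u(k), y(k) = c x(k): one stored number."""
    x, y = 0.0, []
    for uk in u:
        y.append(c * x)
        x = a * x + b * uk
    y.append(c * x)
    return y

u = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]
y_sum = output_nonrecursive(0.5, 2.0, 3.0, u)
y_rec = output_recursive(0.5, 2.0, 3.0, u)
print(max(abs(p - q) for p, q in zip(y_sum, y_rec)))
```

Both routines produce the same outputs, but the first needs all past inputs at every step while the second keeps only the current state.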

One field in which state space realization theory has played a major role is
in network synthesis for both time-invariant and time-varying circuits. A
number of papers have been written in this area (see [C-3-9] and the references
therein), and, in fact, in some of this work (see, for example, [C-4]),
realization concepts are tied together with some concepts concerning dissipative
systems (see Subsection A.2) to yield useful results in network synthesis.

We will not discuss analog network synthesis further, since our major
concern is with relationships with the implementation of digital filters. This
topic will be looked at in some depth in the next subsection, and thus we
content ourselves at present with making only a few comments. For discrete-time
systems, the state-space approach tells us the minimal amount of storage, i.e.
the minimal number of delays, that is needed to realize a given transfer
function. In addition, we know how to obtain any minimal state space realization
from a given one, i.e. we apply (C.10) for any invertible P, and any recursive
linear realization can be written in vector difference equation (i.e. state
space) form by keeping track of all memory updates. Does this mean that
state-space realization solves the digital filter design question? The answer to
that is decidedly no. As we will discuss in the next section, there are many issues
besides minimal storage involved in choosing a "good" filter structure (i.e.
algorithm). However, we know that one can obtain any minimal state space
realization algorithm via the choice of an invertible matrix P and the application
of (C.10). Does this mean that the selection of a "good" filter structure is
equivalent to finding a "good" P? The answer to this is again no, and the
primary reason for this is the distinction between interpreting a state space
realization as a description of dynamical behavior and as an algorithm. We defer the
clarification of this cryptic comment until the next section.

In control theory, state-space realizations play a major role in a number
of very important design problems. In these problems the part played by
realization theory is indirect, in that it allows one to bring into play some
powerful state-space design methods. We illustrate a few of these here.
Consider the system pictured in Figure C.1. We are given an open loop p×m transfer
function G(z) and we wish to design a feedback compensator that has certain
properties. For example, one may wish to design a feedback system so that all
of the modes of the closed loop system have time constants in a specified
range. For scalar systems (p=m=1) techniques (in the frequency domain) for the
solution to this problem have been available for a number of years [A-53,C-2],
and frequency domain techniques for single input systems are discussed in [C-2].
However, as discussed in [C-2], if one uses a state-variable description of the
system, one can obtain a solution in the general, multivariable setting. We
briefly outline a method discussed in [C-14]. Let us suppose G(z) is proper,
rational, and reduced (no element of G has common poles and zeroes). In this
case, let us find a realization


$$x(k+1) = Ax(k) + Bu(k), \qquad y(k) = Cx(k) \tag{C.19}$$

and we note that the poles of G(z) are precisely the eigenvalues of A if and
only if (C.19) is minimal. Suppose we implement a control law of the form

$$u(k) = -Kx(k) \tag{C.20}$$

Then the closed-loop poles are just the eigenvalues of (A-BK). As discussed
in [C-14], we can find a K to place these eigenvalues wherever we want if and
only if (C.19) is controllable. A constructive algorithm is given in [C-14].

Suppose we cannot implement (C.20), i.e. we only have u and y at our
disposal. One might then consider the design of a system that estimates x from
u and y. A natural structure for such an "observer" [C-14,15] is

$$\hat{x}(k+1) = A\hat{x}(k) + Bu(k) + H(y(k) - C\hat{x}(k)) \tag{C.21}$$

Note that if $\hat{x}(k) = x(k)$, then $\hat{x}(n) = x(n)$ for all $n \ge k$, and if
one looks at the error $e(k) = x(k) - \hat{x}(k)$, we find that it obeys the equation

$$e(k+1) = (A-HC)e(k) \tag{C.22}$$

and the poles of A-HC can be placed arbitrarily if and only if (C.19) is
observable.^4 If one then implements the control law

$$u(k) = -K\hat{x}(k) \tag{C.23}$$

one finds that the closed-loop poles are just the eigenvalues of (A-BK) and
(A-HC), and we have solved the pole placement problem. This procedure illustrates
one of the crucial aspects of many state space design methods: the solution to
design problems is an algorithm which, with some ease, can be implemented on a
general purpose computer.^5

^4 Note that (A,C) is observable if and only if (A',C') is controllable (see
(C.11), (C.12)), and that the eigenvalues of A-HC are the same as those of
A'-C'H'. Thus, we can use the same algorithms for finding H as are used to find
K in (C.20).
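As a hedged illustration of the pole placement procedure, the sketch below computes K and H for a single-input, single-output example using Ackermann's formula. This is a standard textbook construction swapped in for illustration and is not claimed to be the algorithm of [C-14]; NumPy, the function name, and the numerical plant are all assumptions of this example.

```python
import numpy as np

def ackermann_gain(A, B, desired_poles):
    """Single-input gain K such that eig(A - B K) equals desired_poles."""
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
    coeffs = np.poly(desired_poles)   # desired characteristic polynomial, leading 1
    p_of_A = sum(c * np.linalg.matrix_power(A, n - i) for i, c in enumerate(coeffs))
    last_row = np.zeros((1, n))
    last_row[0, -1] = 1.0
    return last_row @ np.linalg.inv(ctrb) @ p_of_A

# Hypothetical plant: a sampled double integrator.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
C = np.array([[1.0, 0.0]])

K = ackermann_gain(A, B, [0.5, 0.6])          # state feedback gain
H = ackermann_gain(A.T, C.T, [0.1, 0.2]).T    # observer gain via duality

# Closed loop in the coordinates (x, e), with e = x - xhat.
closed_loop = np.block([[A - B @ K, B @ K],
                        [np.zeros((2, 2)), A - H @ C]])
poles = np.sort(np.linalg.eigvals(closed_loop).real)
print(poles)
```

The observer gain is obtained by applying the same routine to the pair (A', C'), and the eigenvalues of the combined system are the union of the controller and observer poles.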

Algorithmic, state space solutions exist to a wide variety of other
problems, such as decoupling ("design a feedback law so that the ith input affects
only the ith output" [C-16]) and invertibility ("when can we design a system that
will take the output of our given system and recover the input?" [C-17]),
and we refer the reader to the special issue of the IEEE Transactions on
Automatic Control [A-68] for an overview of the various design methods that
have been developed. One important aspect of some of these techniques is that
they allow one to solve quantitative optimization problems. The linear-quadratic
optimal control problem is an example of this, as is the design of a Wiener
filter as a steady-state Kalman filter [A-65,68,C-18]. Consider the estimation
problem illustrated in Figure C.2. We have a Gaussian, stationary process y
with given, rational power spectral density, and we observe the signal z,
which consists of the sum of y and a Gaussian white noise process v. We wish to
design a causal filter that minimizes the variance of the prediction error

$$e(t) = y(t) - \hat{y}(t) \tag{C.24}$$

As discussed in [C-18-21], if we assume that we have an infinite record length
on which to operate, the solution to this problem is the Wiener filter, which
can be obtained by performing a certain spectral factorization. We also know,
however (see [C-18] and Subsection B.2), that the Kalman filter can be used to
solve this problem.

^5 The problem of computer design algorithms is a very important one at present.
Difficulties with ill conditioning are present in many of these, and the design of
"robust" algorithms is a crucial research question in control theory. See [C-22]
for references on this subject.

Given that y has a rational power spectral density, we
can find a minimal representation ("shaping filter")

$$\dot{x}(t) = Ax(t) + w(t), \qquad y(t) = Cx(t), \qquad z(t) = y(t) + v(t) \tag{C.25}$$

Here $E(w(t)w'(\tau)) = Q\delta(t-\tau)$, $E(v(t)v'(\tau)) = R\delta(t-\tau)$, and w and v
are independent. Also (assuming stationarity), x(0) is zero mean with covariance
$P_0$, which satisfies the (continuous-time) Lyapunov equation

$$AP_0 + P_0A' = -Q \tag{C.26}$$

Then, it is well known [C-18] that the optimal filter is given by

$$\dot{\hat{x}}(t) = A\hat{x}(t) + K(t)[z(t) - C\hat{x}(t)], \qquad \hat{x}(0) = 0, \qquad \hat{y}(t) = C\hat{x}(t) \tag{C.27}$$

where

$$K(t) = P(t)C'R^{-1} \tag{C.28}$$

and P is the solution of the Riccati equation

$$\dot{P}(t) = AP(t) + P(t)A' - P(t)C'R^{-1}CP(t) + Q, \qquad P(0) = P_0 \tag{C.29}$$
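For a scalar shaping filter the Riccati equation can be integrated in a few lines, which gives a feel for how the gain settles. The sketch below is only an illustration: it assumes scalar a, c, q, r, uses a simple forward-Euler step with a hand-picked step size, and compares the result with the value of P at which the right-hand side of the scalar Riccati equation vanishes.

```python
import math

def riccati_scalar(a, c, q, r, p0, dt=1e-3, t_end=10.0):
    """Forward-Euler integration of dP/dt = 2 a P - (c P)^2 / r + q, P(0) = p0."""
    p = p0
    for _ in range(int(round(t_end / dt))):
        p += dt * (2.0 * a * p - (c * p) ** 2 / r + q)
    return p

a, c, q, r = -1.0, 1.0, 2.0, 1.0       # hypothetical model parameters
p0 = -q / (2.0 * a)                    # stationary variance: 2 a p0 = -q
p_final = riccati_scalar(a, c, q, r, p0)

# Positive root of 2 a P - c^2 P^2 / r + q = 0.
p_steady = r * (a + math.sqrt(a * a + q * c * c / r)) / (c * c)
gain = p_final * c / r                 # scalar version of K = P C' R^{-1}
print(p0, p_final, p_steady, gain)
```

In this example the error variance decreases from its prior value and levels off, so the time-varying gain approaches a constant.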


Equivalently, one could use one of the fast algorithms discussed in the preceding
section to obtain K(t) directly.

Suppose we let $t \to \infty$, i.e. we consider the limit of an infinite record
length. One can show [A-65,C-18] that the algorithms for K(t) (Riccati or the
fast algorithms) will converge to

$$K = P_\infty C'R^{-1} \tag{C.30}$$

where $P_\infty$ is the unique positive definite solution of the algebraic Riccati
equation

$$AP_\infty + P_\infty A' - P_\infty C'R^{-1}CP_\infty + Q = 0 \tag{C.31}$$

Thus, the state space formulation provides several algorithms which solve the
Wiener filtering spectral factorization problem to yield the optimal transfer
function (from z to $\hat{y}$)

$$G(s) = C(Is - A + P_\infty C'R^{-1}C)^{-1}P_\infty C'R^{-1} \tag{C.32}$$

Thus we see that realization theory, in providing a state space model
for the system to be controlled or the signal to be estimated, plays an
important role in allowing us to utilize rather powerful state-space algorithms
for the specification of designs that possess certain performance characteristics.
Note that all of these algorithms lead to designs that are specified in state
space (e.g. (C.27)) or transfer function (e.g. (C.32)) terms. One must then
face the issue of implementation. If the system is to be implemented in digital
form, the issues raised in the next subsection must be considered in evaluating
the performance of the overall system.

In addition to providing a framework for the specification of designs, the
state space framework allows one to analyze the performance characteristics of
the overall system after it has been implemented. For example, the techniques
described in Section A can be used to study the stability characteristics of
the system. In addition, a subject of much interest is the sensitivity of such
designs (see [C-23] and the references in [C-22]). The major emphasis here is
that designs that come from state space algorithms are model-based, and deviations
between true and assumed parameter values and the fact that the assumed model
is often an idealization of true system behavior will inevitably lead to
variations in the performance of the "optimal" design. Issues such as these have
led to sensitivity studies and to the development of design methods which are
adaptive (see the introduction to Section B) or inherently "robust" [C-24,25]
(see also the discussion in [A-65] on the methods that are used to overcome
sensitivity problems for Kalman filters).

Another analytical tool used to study system performance is covariance
analysis. For linear systems, we consider the model

$$x(k+1) = Ax(k) + w(k), \qquad y(k) = Cx(k) + v(k) \tag{C.33}$$

where w and v are zero mean, independent white noises,

$$E(w(k)w(j)') = Q\delta_{kj}, \qquad E(v(k)v(j)') = R\delta_{kj} \tag{C.34}$$

These noises may represent actual noise sources or the effects of small
nonlinearities (such as quantization noise; see the next subsection), unmodeled
phenomena, etc. A simple calculation yields an equation for the covariances
P(k) and S(k) of x(k) and y(k), respectively (assuming x(0) is zero mean with
covariance P(0)):

$$P(k+1) = AP(k)A' + Q, \qquad S(k) = CP(k)C' + R \tag{C.35}$$

If A is a stable matrix, we can evaluate the steady-state covariances P and
S by solving the Lyapunov equation^6

$$APA' - P = -Q \tag{C.36}$$
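The covariance recursion and its steady-state limit can be checked against each other numerically. The following is a small sketch assuming NumPy; the matrices are hypothetical, and the steady-state equation is solved by rewriting it as a linear system with a Kronecker product, which is one convenient choice for small problems rather than a recommended general-purpose method.

```python
import numpy as np

def propagate_covariance(A, Q, P0, steps):
    """Iterate P(k+1) = A P(k) A' + Q starting from P(0) = P0."""
    P = P0.copy()
    for _ in range(steps):
        P = A @ P @ A.T + Q
    return P

def steady_state_covariance(A, Q):
    """Solve A P A' - P = -Q via (I - kron(A, A)) vec(P) = vec(Q), row-major vec."""
    n = A.shape[0]
    vec_p = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vec_p.reshape(n, n)

# Hypothetical stable system (eigenvalues 0.5 and 0.8).
A = np.array([[0.5, 0.2], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)
R = np.array([[0.01]])

P_iter = propagate_covariance(A, Q, np.zeros((2, 2)), 500)
P_inf = steady_state_covariance(A, Q)
S_inf = C @ P_inf @ C.T + R
print(P_inf, S_inf)
```

With a stable A the recursion converges to the solution of the Lyapunov equation regardless of the initial covariance.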

In the nonlinear case, a number of approximate methods exist (see [A-65,
C-26]), and we refer the reader to [C-27] for the discussion of one widely
used method based on describing functions.

As mentioned earlier, in implementing the designs that arise from state
space methods, one must consider a number of issues that digital signal
processors have studied in great detail. On the other hand, it is possible
that some of the analysis methods mentioned above can be of use in evaluating
the performance of various system implementations.

^6 As mentioned in Section A, the Lyapunov equation (C.36) appears in several
problems in state space system analysis.

C.2 The Implementation of Digital Systems and Filters

As discussed in [C-1], the design of digital systems consists of three
steps:

. Specification of desired properties

. Approximation or realization of these by a causal,
discrete-time system

. Implementation of the system using finite precision
arithmetic.

From this point of view, the methods of the preceding section deal with the
first two issues. Design procedures such as pole allocation and Kalman
filtering specify desired input-output behavior for feedback compensators or
optimal estimation. Realization procedures clearly play an indirect role in
these techniques in providing the state space models on which the design
techniques are based. But what about realizations from the point of view of
system synthesis and implementation? As we shall see, state space realizations
can play some role in implementation, but they are far from providing the
entire solution.

The digital filter design techniques we wish to consider are discussed
in great detail in [C-1,28-33,D-2] (see also the many references in these
texts and papers), and the major emphasis of these methods is toward the
second and third tasks in digital filter design. The techniques for the second
task, as described in [C-1], take as their starting point the specification of
certain frequency response or impulse response characteristics. The role of
the second task is then to take these specifications and produce a scalar transfer
function that meets these design specifications. An excellent description of
the range of available techniques for this problem is given in [C-1, Chapter 5].
We will mention several of these methods but refer the reader to this and the
other references for a thorough treatment. A number of the methods that exist
are based on transformation of analog filter transfer functions. One of these
is the "impulse invariance" method, in which one samples a continuous-time
impulse response to obtain a discrete-time impulse response. This method suffers
from aliasing problems if the analog frequency response is not strictly
bandlimited. A somewhat more complex procedure which avoids the aliasing problem
is the bilinear transformation

$$s(z) = k\,\frac{1 - z^{-1}}{1 + z^{-1}} \tag{C.37}$$

which is an invertible transformation of the z-plane into the s-plane which
maps the inside of the unit circle in the z-plane onto the open left-half plane
of the s-plane (thus preserving stability). One can then transform an analog
transfer function H(s) into a digital function

$$\tilde{H}(z) = H(s(z)) \tag{C.38}$$

Note also that if H is rational, so is $\tilde{H}$. Also, this transformation
introduces nonlinear distortion in the frequency domain (the mapping of the unit
circle in z onto the imaginary axis in s), and care must be taken in achieving a
design with the desired frequency response.
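The frequency distortion of the bilinear transformation can be made explicit: substituting $z = e^{j\omega}$ into (C.37) gives $s = jk\tan(\omega/2)$. The sketch below checks this numerically and shows one common way to compensate, namely choosing k so that a chosen analog frequency lands on a chosen digital frequency. The first-order analog filter and the specific frequencies are hypothetical values picked for the illustration.

```python
import cmath
import math

def bilinear_s(z, k=1.0):
    """The map s(z) = k (1 - 1/z) / (1 + 1/z) of (C.37)."""
    return k * (1 - 1 / z) / (1 + 1 / z)

def digital_response(H, omega, k=1.0):
    """Evaluate H(s(z)) on the unit circle at z = exp(j omega)."""
    return H(bilinear_s(cmath.exp(1j * omega), k))

def analog_lowpass(s):
    """Hypothetical analog prototype H(s) = 1 / (s + 1), cutoff at 1 rad/s."""
    return 1.0 / (s + 1.0)

omega = 0.3 * math.pi
s_on_circle = bilinear_s(cmath.exp(1j * omega))   # should equal j tan(omega / 2)
s_inside = bilinear_s(0.5)                        # a point inside the unit circle

# Pick k so that the analog cutoff (1 rad/s) maps to the digital cutoff 0.2 pi.
omega_c = 0.2 * math.pi
k = 1.0 / math.tan(omega_c / 2)
gain_at_cutoff = abs(digital_response(analog_lowpass, omega_c, k))
print(s_on_circle, s_inside, gain_at_cutoff)
```

The point on the unit circle maps to the imaginary axis, the interior point maps into the left half plane, and with the adjusted k the digital filter has its half-power point at the intended digital frequency.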

In addition to these methods that yield closed form solutions, there are a
number of computer-aided design methods. These include minimizing the
mean-squared error between the actual frequency response and the desired response
at a (finite) set of frequencies. Also, as mentioned in Section B, one can
use linear prediction to fit an all-pole model to a desired impulse response.
In addition, the discussion of the preceding section suggests that the Padé
approximation-partial realization algorithms described in [B-9-12] can be used
to find least-order pole-zero transfer functions that match a certain number
of terms of a desired impulse response.

There are also a number of methods used to design FIR filters. Many of
these involve "windows", in which one multiplies a desired impulse response by
a finite duration window. The usual rectangular window leads to the well-known
Gibbs phenomenon, and more sophisticated windows have been devised to reduce
this effect. The reader is referred to [C-1] for more on windowing and
computer-aided methods for FIR filter design. In addition, for a good discussion
of the issues involved in the overall design problem and of the design of optimum
filters that approximate a given frequency response in the Chebyshev
($L_\infty$) sense, we refer the reader to [D-2]. As these references indicate, a
number of filter design methods are algorithmic in nature (much as the state
space design methods discussed in the previous subsection), and the issue of
efficient numerical design procedures is of central importance.

Once an IIR or FIR filter has been determined, there still remains the
major problem of implementation: the determination of a filter structure
(algorithm) that realizes the given transfer function. One factor that does
enter into this design question is the number of storage elements (delays) in
the filter structure. Structures that contain the minimal number of delays are
called "canonic", and this is clearly the same as the concept of "minimal"
realization. Of course, in dealing with single-input, single-output transfer
functions, one can read off the order of a canonic structure and can construct
several quite easily by simple inspection of the specified transfer function
([C-1, Chapter 4]).

The determination of the order of a canonic realization and the ability to
construct several minimal realizations without much difficulty barely scratch
the surface of the structures problem. That is, the question of minimizing
storage, which is essentially what the state space realization problem considers,
is just one of several problems in digital filter implementation. As pointed out
in [C-1], the various filter structures available may be equivalent from an
input-output viewpoint if one didn't have to worry about computation time, the
complexity of the digital architecture or algorithm required to implement a given
structure, the effect of finite precision in representing filter coefficients, or
the effects of overflow and quantization. These are the issues that motivate much
of the study of various filter structures. It is not our intention to explore all
of the various filter structures and the analytical considerations associated with
them. We will mention a few, however, to illustrate several key points and refer
the reader to the references [C-1,28,34] and to the many papers in the IEEE
Transactions on Circuits and Systems.

For FIR filters, a number of methods exist for the implementation of the
finite convolution

$$y(n) = \sum_{k=0}^{N-1} h(k)x(n-k) \tag{C.39}$$

(here h is the finite impulse response). Clearly, one can directly implement
(C.39) by keeping the last N values of the input in storage. This is the
so-called "direct form" realization [C-1] and requires N multiplications per
output sample. If one is designing a linear phase network, this number can be cut
in half by using the symmetry properties of the impulse response [C-1]. Also, the
convolution (C.39) can be implemented using fast Fourier transform (FFT)
techniques, and this is particularly useful when N is large, in which case one
might use a sectioning algorithm [C-1,32,D-2]. In using FFT techniques, one often
sacrifices storage in order to gain computational efficiency, e.g. we may take N
to be a power of 2 or may use overlap sectioning methods [C-1] for efficient
operation when the input sequence x is long.
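The saving available for linear phase filters comes from pairing input samples that share a coefficient before multiplying. A minimal sketch, assuming an even-length symmetric impulse response with h(k) = h(N-1-k) and an input that is zero before time 0; the function names and sample values are illustrative.

```python
def fir_direct(h, x):
    """Direct form of the finite convolution: N multiplications per output."""
    N = len(h)
    return [sum(h[k] * x[n - k] for k in range(N) if n - k >= 0)
            for n in range(len(x))]

def fir_linear_phase(h, x):
    """Same output for even-length symmetric h, using N/2 multiplications."""
    N = len(h)
    if N % 2 or any(h[k] != h[N - 1 - k] for k in range(N // 2)):
        raise ValueError("expects an even-length symmetric impulse response")

    def sample(n):
        # The input is taken to be zero before time 0.
        return x[n] if n >= 0 else 0.0

    return [sum(h[k] * (sample(n - k) + sample(n - (N - 1 - k)))
                for k in range(N // 2))
            for n in range(len(x))]

h = [0.1, 0.4, 0.4, 0.1]
x = [1.0, 2.0, -1.0, 0.5, 0.0, 3.0]
y_direct = fir_direct(h, x)
y_folded = fir_linear_phase(h, x)
print(y_direct, y_folded)
```

The folded version adds the two samples that share a coefficient first, so each output needs two multiplications here instead of four.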

For IIR filters, a number of filter structures have been developed. In this
case, we are attempting to realize the transfer function

$$H(z) = \frac{\sum_{k=0}^{M} b_k z^{-k}}{1 - \sum_{k=1}^{N} a_k z^{-k}} \tag{C.40}$$

which is equivalent to the difference equation

$$y(n) = \sum_{k=1}^{N} a_k y(n-k) + \sum_{k=0}^{M} b_k x(n-k) \tag{C.41}$$

The direct implementation of equation (C.41), called the direct form I
realization, requires storage of the last N values of y and the last M values
of x. This structure is far from minimal, as it is easily seen that the minimal
number of delays is max(N,M). However, a slight modification of direct form I
yields the canonic realization direct form II (see [C-1, p.150]).

By examining the transfer function (C.40), one can obtain a number of other
canonic structures. For example, if H(z) is expanded in partial fraction form,
we can obtain parallel form structures, while if we factor H(z) as the product
of simpler transfer functions, we can obtain series or cascade structures. Let
us give an example of the cascade structure. Suppose we have

$$H(z) = \frac{z^2 + (b+d)z + bd}{z^2 - (a+c)z + ac} = \frac{(1+bz^{-1})(1+dz^{-1})}{(1-az^{-1})(1-cz^{-1})} \tag{C.42}$$

In Figure C.3 we have realized this filter as the cascade of two first order
filters in direct form II. Note that the overall filter is canonic.
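The cascade just described can be sketched in code and compared with a direct form I implementation of the same second order transfer function. This is an illustration under stated assumptions: each first order section is written as $w(n) = x(n) + \alpha w(n-1)$, $y(n) = w(n) + \beta w(n-1)$, the arithmetic is ordinary floating point (no quantization), and the coefficient values are hypothetical.

```python
def first_order_df2(alpha, beta, x):
    """Direct form II section for (1 + beta/z) / (1 - alpha/z): one delay."""
    w_prev, y = 0.0, []
    for xn in x:
        w = xn + alpha * w_prev
        y.append(w + beta * w_prev)
        w_prev = w
    return y

def cascade(a, b, c, d, x):
    """Two first order sections in series: two delays in total."""
    return first_order_df2(c, d, first_order_df2(a, b, x))

def direct_form_1(a, b, c, d, x):
    """Second order difference equation with four delays (two for x, two for y)."""
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = (a + c) * y1 - a * c * y2 + xn + (b + d) * x1 + b * d * x2
        y.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return y

impulse = [1.0] + [0.0] * 11
a, b, c, d = 0.5, 0.3, -0.4, 0.2       # hypothetical coefficients
h_cascade = cascade(a, b, c, d, impulse)
h_direct = direct_form_1(a, b, c, d, impulse)
print(max(abs(p - q) for p, q in zip(h_cascade, h_direct)))
```

The two structures give the same impulse response in exact arithmetic; they differ in storage (two delays versus four) and, once word length is finite, in their quantization behavior.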

The major questions surrounding the choice of filter structure include
the consideration of computational efficiency, the effects of finite word length
on filter stability and performance, and the effect of finite precision in
representing filter parameters. We have already said a few words concerning
computational efficiency, and refer the reader to the references for more on
this issue (in particular, see [C-1,32] for detailed discussions and further
references on the use of the FFT algorithm).^7 In addition, in Section A we
considered the effects of quantization and overflow on system stability. An
alternative, approximate method for evaluating the effect of finite word length 


2Ui interesting question in the area of computational efficiency is the determi- 
nation of filter structures that require the smallest number of delays and multi- 
plies. For second order transfer functions Lueder [C-50] has shown that there are 
precisely 32 such structures. An intriguing related question in the state space 
area is the detesnnination of a realization in which A,B, and C have as few 
elements as possible that are not 0,1, or -1. As far as we are aware, no work 
exists on this problem. 




Figtire C»4; An n-th Order Cascade Filter Including Quantization Noise 
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on system performance is to model each quantization as if it introduced noise 
(representing, for example, roundoff or truncation) into the system. This 
approach is discussed at some length in Ea-3,12,c- 1,33737] . The basic idea 
IS that whenever a quantization occurs, one replaces it by an "equivalent noise 
source". Then, by assuming independence of these various sources — a rather 
strong cind many times m^ustified assumption (as the existence of periodic ef- 
fects, i.e. limit cycles, indicates) — one can in principle evaluate the over- 
all noise power at the output, and thus can obtain a measure of the size of 

8 

^antization effects. As an example, consider the case [C-1] of fixed-point 

arithmetic and roundoff quantization (Figure A. 5) in which the quantization 

^ -b 9 

interval q is 2 (i.e. the niuriber of bits used to represent fractions is b) , 

In this case, the quantization error e introduced by a single multiplication 

falls in the bound 

“ j 2“^ < e < j 2"^ (C.43) 

If one makes the assumption that e is- uniformly distributed, we find that it has 


^8 Parker and Girard [C-55] have shown how one can take the correlation in these noise
sources into account. Specifically, quantization noises due to multiplication of
the same signal by two different coefficients are correlated, and the correlation
can be approximated by a function that depends on the coefficients. In addition,
Parker and Girard point out that correlation increases as the number of bits decreases.
We also refer the reader to the work of Eckhardt and Schussler [C-56] on evaluating
quantization error variances.

^9 Here, we follow the standard fixed point procedure in which all numbers are
represented as fractions. One can also consider noise analysis for floating point
[C-1,33]. See also the work of Fettweis [C-52,53,54] in which noise analysis is
performed with the aid of certain system sensitivity functions.

zero mean and variance

    \sigma_e^2 = \frac{1}{12} 2^{-2b}    (C.44)
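
The variance in (C.44) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it rounds hypothetical products, drawn uniformly from [-1, 1], to b fractional bits and compares the empirical error variance with 2^{-2b}/12. The function names, the input distribution, and the number of trials are all assumptions made for illustration.

```python
import random

def roundoff_error_stats(b, trials=100_000, seed=0):
    """Empirical mean and variance of the error from rounding to b fractional bits."""
    rng = random.Random(seed)
    q = 2.0 ** (-b)                      # quantization interval q = 2^(-b)
    errors = []
    for _ in range(trials):
        v = rng.uniform(-1.0, 1.0)       # hypothetical product before rounding
        quantized = q * round(v / q)     # nearest multiple of q
        errors.append(quantized - v)     # error lies within [-q/2, q/2]
    mean = sum(errors) / trials
    var = sum((e - mean) ** 2 for e in errors) / trials
    return mean, var

def predicted_variance(b):
    """Variance predicted by (C.44): 2^(-2b) / 12."""
    return 2.0 ** (-2 * b) / 12.0
```

For b = 8, the ratio of the empirical variance to `predicted_variance(8)` should be close to one. Note that this only tests the uniform-distribution assumption for a single quantizer; it says nothing about independence between different noise sources.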


Using these assumptions, one can add independent noise sources to filter
representations to account for quantization effects. For example, in the cascade
example of Figure C.3, one could add one noise source following each of the
four multiplications (fewer noise sources might result from a different
quantization procedure, e.g. if we add the products bx_1 and cx_2 before quantizing).

Another extremely important issue in filter design is the sensitivity of
filter performance to variations in coefficients. This is quite a central
issue, since one can only represent coefficients up to a finite degree of
accuracy, and hence one cannot obtain filters with arbitrary pole and zero locations.
As described in [C-1 (Chapter 4), C-28,31], the allowable poles and zeroes and the
sensitivity to variations in parameters depend quite significantly on the
particular structure under consideration. For example, parallel and cascade
structures are often used because of their sensitivity properties, since the
perturbations in the poles are isolated from one another [C-1].

A great deal of work [C-1,28-31,33-37] has gone into developing methods
for answering a variety of questions concerning various filter structures.
Questions considered include: (1) the determination of the number of bits needed
in a given filter structure to obtain required accuracy in overall performance
both from the point of view of parameter sensitivity and quantization noise; and
(2) determination of "rules of thumb" [C-33,37] for the pairing and ordering of
poles and zeroes in a cascade structure in order to minimize the effects of
quantization noise. The study of questions such as these for large interconnected
networks is a complex problem, and efficient algorithms are needed to evaluate
overall sensitivities, effects of noise, etc. One such large-scale package
involves the use of techniques for the manipulation of signal flow graphs. The
use of such techniques is discussed in [C-1,28,31], and a detailed description of
a computer package to perform a number of types of analysis on digital networks
is contained in [C-28].

For the remainder of this section, we wish to examine the relationship of
state space techniques and concepts to some of the questions in digital filter
design. This discussion is a first attempt to study such relationships, and a
great deal more work is needed before the issues can be thoroughly understood.

Let us first examine the use of state space techniques to determine filter
structures. As described in the preceding subsection, realization techniques can be
used to obtain minimal realizations, i.e. certain canonic algorithms. Consider
the transfer function (C.42). In this case, state space techniques yield a
variety of minimal (in this case two-) dimensional realizations of the form

    x(k+1) = Fx(k) + gu(k)
    y(k) = h'x(k) + u(k)    (C.45)

where

    h'(zI-F)^{-1}g + 1 = \frac{z^2 + (b+d)z + bd}{z^2 - (a+c)z + ac}    (C.46)

    F = \begin{bmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{bmatrix}, \quad g = \begin{bmatrix} g_1 \\ g_2 \end{bmatrix}, \quad h = \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}    (C.47)


Let us interpret (C.45) as an algorithm. Assume we presently have computed
x(k) and receive the new input u(k).

PART #1: (a) Multiply h_1 and x_1(k)
         (b) Multiply h_2 and x_2(k)
         (c) Add these, together with u(k), to yield y(k)

PART #2: (a) Multiply f_11 and x_1(k)
         (b) Multiply f_21 and x_1(k)
         (c) Multiply f_12 and x_2(k)
         (d) Multiply f_22 and x_2(k)
         (e) Multiply g_1 and u(k)
         (f) Multiply g_2 and u(k)
         (g) Add (a), (c), and (e) to yield x_1(k+1)
         (h) Add (b), (d), and (f) to yield x_2(k+1)

Clearly a number of these steps can be done in different orders, but the above
steps do indicate the basic algorithm implied by (C.45). Note that in general,
there are 8 multiplications and 6 additions required.

Now let us examine the cascade structure of Figure C.3, and let us interpret
it as an algorithm:

    (a) Multiply a and x_1(k)
    (b) Multiply b and x_1(k)
    (c) Multiply c and x_2(k)
    (d) Multiply d and x_2(k)
    (e) Add (a) and u(k)
    (f) Add (b) and (e)
    (g) Add (c) and (f)
    (h) Add (d) and (g)

    (e) = x_1(k+1)
    (g) = x_2(k+1)
    (h) = y(k)


Note that this algorithm requires 4 multiplications and 4 additions, but this
is not the most crucial difference between the two algorithms, since it is
possible to obtain realizations (C.45) with some zero elements in (F,g,h).
However, the crucial difference is the following: if one interprets a state
space realization as determining an algorithm of the type indicated, then there
is no way that the cascade algorithm is of this type! This is not to say that
one cannot find a state-space description of the cascade realization. In fact



    x(k+1) = \begin{bmatrix} a & 0 \\ (a+b) & c \end{bmatrix} x(k) + \begin{bmatrix} 1 \\ 1 \end{bmatrix} u(k)
    y(k) = [(a+b), (c+d)] x(k) + u(k)    (C.48)

is such a realization. Note that if one takes into account that one doesn't
have to multiply by 1 and that one multiplication is used twice, then (C.48)
requires only 4 multiplications

    a x_1(k)
    (a+b) x_1(k)
    c x_2(k)
    (c+d) x_2(k)

and 5 additions.
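
The claim that (C.48) describes the same input/output behavior as the cascade algorithm, while prescribing a different sequence of operations, can be checked with a short sketch. The function names and the test coefficients below are illustrative assumptions; only the step ordering (a)-(h) and the matrices of (C.48) come from the text.

```python
def cascade_step(x1, x2, u, a, b, c, d):
    """One pass through cascade steps (a)-(h); returns (x1_next, x2_next, y)."""
    pa = a * x1          # (a)
    pb = b * x1          # (b)
    pc = c * x2          # (c)
    pd = d * x2          # (d)
    se = pa + u          # (e) = x_1(k+1)
    sf = pb + se         # (f)
    sg = pc + sf         # (g) = x_2(k+1)
    sh = pd + sg         # (h) = y(k)
    return se, sg, sh

def state_space_step(x1, x2, u, a, b, c, d):
    """One step of realization (C.48), read directly as a state space algorithm."""
    y = (a + b) * x1 + (c + d) * x2 + u
    x1_next = a * x1 + u
    x2_next = (a + b) * x1 + c * x2 + u
    return x1_next, x2_next, y

def run(step, inputs, coeffs):
    """Drive a step function from zero initial state and collect the outputs y(k)."""
    x1 = x2 = 0.0
    outputs = []
    for u in inputs:
        x1, x2, y = step(x1, x2, u, *coeffs)
        outputs.append(y)
    return outputs
```

With infinite precision the two output sequences coincide. The difference the text emphasizes appears only once each intermediate product or sum is quantized, because the two functions form different intermediate quantities.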

The point made above may, at first glance, seem to be trivial, but it is
not, since it points out that although any (infinite precision) algorithm can
be described dynamically in state space terms, direct interpretation of a state
space description as an algorithm does not allow one to consider all possible
algorithms. That is, it is relatively easy to go from an algorithm to a
state-space description, e.g. (C.48), but it is not at all natural or clear how to
go the other way, and hindsight is needed in order to interpret the realization

    x(k+1) = \begin{bmatrix} f_{11} & 0 \\ f_{21} & f_{22} \end{bmatrix} x(k) + \begin{bmatrix} 1 \\ 1 \end{bmatrix} u(k)
    y(k) = [f_{21}, h_2] x(k) + u(k)    (C.49)


as a cascade structure with

    a = f_{11}, \quad b = f_{21} - f_{11}, \quad c = f_{22}, \quad d = h_2 - f_{22}    (C.50)

Recently, Chan [D-107] defined a unified framework for the consideration
of all 1-D structures. Chan noted that if one viewed (C.45) as a map from present
state and input to next state and present output

    \begin{bmatrix} x(k+1) \\ y(k) \end{bmatrix} = \Phi \begin{bmatrix} x(k) \\ u(k) \end{bmatrix}

then any filter structure can be viewed as a factorization of \Phi and a change of
basis on x. Specifically, consider the example (C.46), with the realization
(C.48), which yields

    \Phi = \begin{bmatrix} a & 0 & 1 \\ (a+b) & c & 1 \\ (a+b) & (c+d) & 1 \end{bmatrix}

Let us write \Phi = \Phi_2 \Phi_1, where

    \Phi_1 = \begin{bmatrix} a & 0 & 1 \\ b & c & 0 \\ 0 & d & 0 \end{bmatrix}, \quad \Phi_2 = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}

Then, if we interpret this factorization as an algorithm (perform the operations
indicated by \Phi_1 first and then perform those specified by \Phi_2), it is clear that
we essentially have the cascade algorithm as depicted in Figure C.3. Thus Chan's
technique provides a conceptual framework in which to consider structures from a
state point of view. As Chan points out, it is not yet clear how one can use this
factorization technique in an algorithmic fashion to determine useful new
structures. At the very least, it provides a unified framework for the
consideration of questions related to realization structures.
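
The factorization of the state-and-input map for the cascade example can be verified by direct multiplication. The sketch below assumes the factors read as: a first factor with rows [a 0 1], [b c 0], [0 d 0] (the four multiplications plus the input), followed by a lower triangular matrix of ones (the running additions). The helper names are hypothetical.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def cascade_map(a, b, c, d):
    """Map from [x_1(k), x_2(k), u(k)] to [x_1(k+1), x_2(k+1), y(k)] implied by (C.48)."""
    return [[a, 0, 1],
            [a + b, c, 1],
            [a + b, c + d, 1]]

def cascade_factors(a, b, c, d):
    """Assumed factors: products first (phi1), then cumulative sums (phi2)."""
    phi1 = [[a, 0, 1],
            [b, c, 0],
            [0, d, 0]]
    phi2 = [[1, 0, 0],
            [1, 1, 0],
            [1, 1, 1]]
    return phi1, phi2
```

Applying `phi1` produces a x_1 + u, b x_1 + c x_2, and d x_2; applying `phi2` then accumulates them into x_1(k+1), x_2(k+1), and y(k), which mirrors cascade steps (e), (g), and (h).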

Thus, we see that there are potential limitations to the state space
framework for determining new filter structures, although the ideas of Chan may provide
a conceptual unification of these subject areas. In addition to Chan's work, there
appear to be several other structures-related areas in which state space concepts
may play a role. Recall that state space realization techniques allow one to determine
minimal realizations for systems with multiple inputs and outputs. It is possible
that this fact, combined with a thorough understanding of the relationship between
state-space realizations and various digital system structures, will lead to the
development of useful filter structures (possessing desirable storage,
computational, sensitivity, and quantization characteristics) for multivariable systems.
It is hoped that the preceding treatment of a simple cascade example will help
expose some of the issues that need to be understood.

Also, as mentioned in the preceding subsection, the state space framework
is particularly useful for the analysis of the properties of dynamical systems.
Thus, it seems natural to ask if these techniques might be useful in the analysis
of various filter structures. We have already discussed this question in Section
A with respect to stability analysis techniques. It is also possible that
state-space sensitivity techniques [C-23] could be useful in the study of the
sensitivity of various digital filter structures, but this awaits further study.

Finally, let us examine the utility of state-space techniques in the analysis
of the effect of quantization noise on filter performance. We do this by example,
although it should be clear that this approach extends to arbitrary structures.
Consider the cascade structure in Figure C.4. Here we have included quantization
noise after each multiplication. A state space representation of this filter can

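be written down by inspection:

    x(k+1) = Ax(k) + bu(k) + \Gamma e(k) + \Lambda f(k)
    y(k) = c'x(k) + u(k) + \theta e(k) + \Psi f(k)    (C.51)

where

    x(k) = \begin{bmatrix} x_1(k) \\ \vdots \\ x_n(k) \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}, \quad c = \begin{bmatrix} (a_1+b_1) \\ \vdots \\ (a_n+b_n) \end{bmatrix}, \quad \theta = \Psi = [1 \; 1 \; \cdots \; 1]

    A = \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ (a_1+b_1) & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ (a_1+b_1) & (a_2+b_2) & \cdots & a_n \end{bmatrix}

    \Gamma = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{bmatrix}, \quad \Lambda = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \end{bmatrix}    (C.52)

    e(k) = \begin{bmatrix} e_1(k) \\ \vdots \\ e_n(k) \end{bmatrix}, \quad f(k) = \begin{bmatrix} f_1(k) \\ \vdots \\ f_n(k) \end{bmatrix}

Also, the noises e_1,...,e_n, f_1,...,f_n are assumed independent, identically
distributed, zero-mean white processes with variance (C.44). Then, assuming
that A is a stable matrix and using the covariance analysis procedure described
in Subsection C.1, we can compute the steady-state variance \sigma_y^2 of y:^10

    \sigma_y^2 = c'Pc + (\theta\theta' + \Psi\Psi')\sigma_e^2    (C.53)

where P, the covariance of x, is the solution of the Lyapunov equation

    P = APA' + [\Gamma\Gamma' + \Lambda\Lambda']\sigma_e^2    (C.54)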


Equations (C.54) and (C.53) are perfectly suited to computer implementation.
Also, note that the solution of (C.54) yields the effect of noise throughout
the network. The utility of an approach such as this for digital network analysis
needs to be examined more carefully, but it appears that it may be computationally
superior to other methods, such as those that require computing a number of
partial transfer functions (from each noise source to the output; see [C-5]).
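
As a rough illustration of how such a covariance computation might be programmed, the sketch below builds the cascade noise model and solves the Lyapunov equation by simple fixed-point iteration. It rests on several assumptions that are not confirmed by the text: A is taken to be lower triangular with a_i on the diagonal and (a_j + b_j) below it, noise e_i enters section i and every later section, noise f_i enters only later sections, and every e_i and f_i also appears directly in y. All function names are hypothetical, and the result is normalized by the single-source variance of (C.44).

```python
def transpose(X):
    return [list(row) for row in zip(*X)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def cascade_noise_model(a, b):
    """Assumed A, c, Gamma, Lambda for an n-section cascade with coefficients a[i], b[i]."""
    n = len(a)
    A = [[a[i] if j == i else (a[j] + b[j] if j < i else 0.0)
          for j in range(n)] for i in range(n)]
    c = [a[j] + b[j] for j in range(n)]
    gamma = [[1.0 if j <= i else 0.0 for j in range(n)] for i in range(n)]
    lam = [[1.0 if j < i else 0.0 for j in range(n)] for i in range(n)]
    return A, c, gamma, lam

def solve_lyapunov(A, Q, tol=1e-12, max_iter=100000):
    """Iterate P <- A P A' + Q from P = 0; converges when A is stable."""
    n = len(A)
    At = transpose(A)
    P = [[0.0] * n for _ in range(n)]
    for _ in range(max_iter):
        APAt = matmul(matmul(A, P), At)
        P_next = [[APAt[i][j] + Q[i][j] for j in range(n)] for i in range(n)]
        change = max(abs(P_next[i][j] - P[i][j])
                     for i in range(n) for j in range(n))
        P = P_next
        if change < tol:
            break
    return P

def noise_gain(a, b):
    """Output noise variance divided by the variance of one noise source."""
    A, c, gamma, lam = cascade_noise_model(a, b)
    n = len(a)
    GG = matmul(gamma, transpose(gamma))
    LL = matmul(lam, transpose(lam))
    Q = [[GG[i][j] + LL[i][j] for j in range(n)] for i in range(n)]
    P = solve_lyapunov(A, Q)
    state_part = sum(c[i] * P[i][j] * c[j] for i in range(n) for j in range(n))
    return state_part + 2 * n            # 2n direct noise terms in y (assumed)

def output_noise_variance(a, b, bits):
    """Scale the normalized gain by the roundoff variance 2^(-2*bits)/12."""
    return noise_gain(a, b) * 2.0 ** (-2 * bits) / 12.0
```

For a single section with a = 0.5 and b = 0.25 the gain can be checked by hand: the state variance is 1/(1 - a^2) = 4/3, so the gain is (0.75)^2 (4/3) + 2 = 2.75. A production implementation would more likely call a library Lyapunov solver than iterate.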

We also note that if the noise sources are correlated, as they are shown to be
in [C-55], one can adapt the preceding procedure by augmenting the filter state
equations with a shaping filter that yields the correct correlation in the error
sources. We note that Parker and Girard [C-55] used Lyapunov-type equations and
analysis quite similar to our development for the evaluation of output noise power
due to correlated quantization errors. In addition, similar analyses have been
undertaken by Hwang [C-64], Mullis and Roberts [C-65], and Sripad and Snyder
[C-66,67]. Hwang uses Lyapunov-state space equations to study the effects of possible
structure transformations and state-amplitude scalings; Mullis and Roberts use a
similar framework to study what they call "minimal noise realizations"; and Sripad
and Snyder develop conditions under which quantization errors are in fact white,
and they also use Lyapunov-type analysis to compare the performance of two different
realizations. These references clearly indicate the potential benefits of this
type of analysis.

^10 Here we have assumed u=0. The analysis of the deviation of y from the desired
value when u ≠ 0 is identical to the above (assuming that e and f are independent
of u).

Within the framework described above, one can pose a number of other questions.
For example, one can perform a similar noise analysis if random rounding is used.^11
Also, Schussler [C-51] has proposed a figure of merit for structures: the required
number of bits to meet given noise specifications. In terms of (C.53) and (C.54)
this would mean determining b so that the resulting \sigma_y^2 is less than some
prescribed limit. Is it possible that we can devise algorithms for the solution of
such problems for this and for more general structures? In addition, in the case of
floating point arithmetic, the quantization error depends on the size of the signal.
Can state-space procedures for analyzing "state-dependent noise" [C-57,58] be of
value here? Questions such as these await future investigation.
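
For one fixed structure, and only under the independent white noise model used above, the output noise variance is a constant multiple of the single-source variance 2^{-2b}/12, so the smallest adequate b can be found by direct search. The sketch below illustrates that special case only; `noise_gain` (the ratio of output variance to source variance) and `limit` are hypothetical inputs, and the effect of b on the quantized coefficients themselves is ignored.

```python
def bits_needed(noise_gain, limit, max_bits=64):
    """Smallest b with noise_gain * 2^(-2b) / 12 <= limit, searched upward from 0."""
    for b in range(max_bits + 1):
        if noise_gain * 2.0 ** (-2 * b) / 12.0 <= limit:
            return b
    raise ValueError("limit not reachable within max_bits")
```

This does not address the harder questions raised in the text, such as comparing structures, handling correlated sources, or floating point arithmetic.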

C.3 Direct Design Taking Digital Implementation Into Account

As discussed in the preceding subsections, design procedures in both
disciplines consist of several parts: determining the desired input/output
behavior to be synthesized and the design of an algorithm that approximates
this behavior given the constraints of digital implementation. The procedures
as described up to this point treat these issues separately, but, as discussed
in [D-2], it would be of value to consider overall design methods that take the
discrete nature of the computer into account during the process of developing
design specifications and allow the study of tradeoffs such as performance versus
number of bits used. The development of full-fledged design procedures is
clearly a long way off; however, in recent years some research in control and
estimation theory has been aimed at developing designs that reflect the interaction
of system specification and the limitations and structure of the digital system
that is to be used to implement the system. We will briefly describe several
of these and refer the reader to [C-38-46,D-85,86,87,88,89,91,92] for details.

^11 As Schussler [C-51] points out, one often designs filters with limit cycles, since
filters without limit cycles often have poor noise behavior, and one can overcome
the limit cycle problem by using randomized rounding (hence adding a bit more "noise"
to the system).

Consider the continuous time linear system

    \dot{x}(t) = Ax(t) + Bu(t)    (C.55)

where x \in R^n, u \in R^m. Suppose we wish to control the system with a digital
control system. Specifically, suppose we can observe x(k) = x(k\Delta), k=1,2,...,
and, based on these observations, we feed back a control

    u(t) = u(k),    k\Delta \le t < (k+1)\Delta    (C.56)

In addition, suppose we wish to design the control law to minimize

    J = \int_0^\infty [x'(t)Qx(t) + u'(t)Ru(t)] dt    (C.57)

For a fixed value of \Delta, this problem leads to an optimal discrete-time control
problem [C-39] with the feedback law

    u(k) = Gx(k)    (C.58)

Suppose this law is to be implemented in a digital system that takes \tau_m
seconds to perform a multiplication. Then (assuming add time is negligible),
in general the control law requires

    \Delta \ge nm\tau_m    (C.59)

Thus, it is clear that each control algorithm requires a minimum time
\Delta_min between successive samples, and the following question arises: suppose
we consider a "suboptimal" control algorithm that can be implemented at a
faster sampling rate than the bound for the "optimal" law in (C.59); is it
possible that such a law can outperform the slower, "optimal" law? This question
is answered in the affirmative in [C-38], in which a simple example is given and
an indication is given that one can achieve performance improvements for a class
of large-scale, "loosely-coupled" systems. One can also interpret these results
as providing a method for determining the value of a faster computer, as measured
by the accompanying decrease in J, i.e. for a given control law and two possible
multiplication times \tau_1, \tau_2 (\tau_1 < \tau_2) the cost difference
J(\tau_2) - J(\tau_1) can be interpreted as the amount one would be willing to pay
for the faster machine. This can provide a basis for a tradeoff analysis: the cost
of a faster computer


versus achievable performance improvement.

The question of devising control and estimation designs and digital
architectures that are especially natural for particular applications is
receiving more and more attention as available digital systems are being
improved and made less expensive. Specifically, the development of microprocessor
technology has led to a great increase in the design of control and estimation
systems that involve a number of identical modules, parallel structures, and
distributed processing [B-103,104,C-40-46]. In the area of decentralized control
[D-85,86,87,88,89,91,92] one often has an extremely large and distributed system
with many inputs and outputs, and one wishes to design a set of "local" controllers,
i.e., a set of several control laws, each of which uses only some of the inputs
and some of the outputs and is perhaps implemented on a dedicated processor.
Clearly the architecture of such a system (i.e. who gets to know what) is a major
design variable. Again one can interpret the difference in performance of two
different architectures as a measure of how much more one would be willing to pay
for one system than another. Clearly a totally centralized system would perform
best, but the cost of relaying all information to and from one central location
may be prohibitive.

The study of problems such as this, i.e. the interaction of implementation
and architecture issues (parallelism, decentralization) and the design of control
and signal processing algorithms, is still in its infancy, and it appears to
offer an extremely promising avenue for research and for applications to problems
in fields such as aircraft control [C-40,42-44] and nonlinear stochastic filtering
[C-45,46]. We note that architectural issues have received a great deal of
attention in the field of digital signal processing [C-28,31,47,48], and this
appears to be a promising direction for future interaction and collaboration.

We also note that there has been work [C-68,69] in digital filter design
aimed at developing structures and design techniques that take the constraints
of finite arithmetic into account at the start. In addition, the restrictions
of finite arithmetic have, in part, motivated the study of linear systems in
which the vectors and scalars are all integer valued, i.e. linear systems over
rings [E-28-31]. The work to this point has been quite theoretical, and its value
in allowing one to design digital controllers or filters has yet to be established.

Finally, we note that in both disciplines there is a great deal of interest
in the development of fast on-line algorithms. In digital signal processing,
fast Fourier transform algorithms [C-1,59,60] have been widely used (for example,
in the implementation of FIR filters). The FFT has also found use in control
theory (see, for example, its use in implementing matched filters for detection
of failures in dynamic systems [C-61] and in designing efficient optimal
controllers for certain large interconnected systems that possess some symmetry in
their structure [D-93]). In addition, motivated by the algebraic treatment of
Nicholson [C-60], Willsky [C-62] has developed fast algorithms for several types
of "noncommutative convolutions" that occur in certain nonlinear filtering problems
(see also [C-63]). Also, all of the fast Kalman gain algorithms discussed in
Section B are potentially useful in the design of efficient adaptive control
systems. The implementation of systems along these lines and the development of
new efficient on- and off-line procedures remains an active area of research in
both disciplines.

D. Multiparameter Systems, Distributed Processes, and Random Fields

A growing interest has developed over the past few years in problems
involving signals and systems that depend on more than one independent
variable. In some cases one of these variables is time, and the others represent
spatial dimensions, as in the study of distributed parameter systems
[D-157-159] or decentralized control [D-87,88,93,94], while for other
problems, such as image processing [D-4,6,7,20,21], none of the independent
variables can be thought of as time.

This research area is rich in both potential application areas and in 
challenging theoretical problems. Among the areas of application are image 
processing, seismic signal processing, meteorology, gravity field mapping, 
pollution monitoring and control, and inertial navigation. On the theoretical 
side, there are a number of basic conceptual questions. How does one handle 
the processing of distributed data in an efficient manner? What properties 
do recursive techniques have in a setting where the recursion is in more than 
one dimension? Do causality and state make any sense here? What about 
stability? What are the tools for analyzing stochastic processes? How do we
"predict" when the independent variables aren't time? What role do recursive
estimation techniques play (what are recursive estimation techniques?)? Which 
concepts concerning signals and systems in one independent variable carry over 
to the multiparameter case? Which do not, and why don't they? 

In this section we consider several problem areas involving multiparameter
signals and systems in order to discuss some of the issues mentioned above in
more detail.

D.1 Two Dimensional Systems and Filters

Over the past few years, a great deal of work has been done in attempting 
to extend one-dimensional filtering concepts to the design and analysis of 
systems that process data that is distributed in two-dimensional (2-D) arrays. 

The consideration of 2-D systems has opened up an entirely new set of
questions, and in this section we want to explore some of these design and
analysis issues. For an excellent and thorough overview of 2-D digital
filtering, we refer the reader to [D-3].

As in 1-D, we can define a linear shift-invariant (LSI) system that
processes 2-D input arrays x(m,n) to produce 2-D output arrays in a linear
fashion and so that a shift in the "time" origin for the input merely induces
an analogous shift in the output. Such a system has a convolutional
representation, much as in 1-D

    y(m,n) = \sum_{k}\sum_{\ell} h(m-k,n-\ell) x(k,\ell) = h(m,n) * x(m,n) = x(m,n) * h(m,n)    (D.1)

Here h(j,k) is the unit sample response, i.e. the response of the system
to the input

    x(k,\ell) = \delta_{k0} \delta_{\ell 0}    (D.2)

(here \delta_{ij} is the Kronecker delta, which is nonzero and equals 1 only if i=j).
The unit sample response is sometimes referred to as the point-spread function
[D-4], a term used in image processing, where h(j,k) has the interpretation
as the observed image when the input illumination is a point source at the
origin.

Again as in the 1-D case we can take z-transforms. For example, the
system function of (D.1) is

    H(z_1,z_2) = \sum_{k=-\infty}^{+\infty} \sum_{\ell=-\infty}^{+\infty} h(k,\ell) z_1^{-k} z_2^{-\ell}    (D.3)

and a simple calculation transforms (D.1) into

    Y(z_1,z_2) = H(z_1,z_2) X(z_1,z_2)    (D.4)

An important class of LSI systems arises from rational system functions

    H(z_1,z_2) = \frac{A(z_1,z_2)}{B(z_1,z_2)}    (D.5)

    A(z_1,z_2) = \sum_{(k,\ell) \in I_1} a(k,\ell) z_1^{-k} z_2^{-\ell}
    B(z_1,z_2) = \sum_{(k,\ell) \in I_2} b(k,\ell) z_1^{-k} z_2^{-\ell}    (D.6)

where I_1, I_2 are finite sets of pairs of integers. As a straightforward
consequence of (D.4)-(D.6), we obtain a 2-D (partial) difference equation
relating y and x:

    \sum_{(k,\ell) \in I_2} b(k,\ell) y(m-k,n-\ell) = \sum_{(k,\ell) \in I_1} a(k,\ell) x(m-k,n-\ell)    (D.7)

Up to this point, the mathematical steps taken follow the 1-D steps
very closely, but now we begin to see some of the conceptual as well as
mathematical difficulties that arise in the 2-D case. Let us first discuss
the problem of recursion. Given the equation (D.7), we want to use it to
calculate the next output given previous outputs and the input. Embedded in
this statement is the heart of one of the problems. Unlike the 1-D case, in
which the index n has the interpretation of time, in the 2-D case, in general,
it is not clear what "next" or "previous" mean.^1 In fact, just given (D.7)
it is not clear that there is any definition of next or previous that will
allow us to recursively compute y(m,n). Dudgeon [D-3,5,33], Pistor [D-42],
and Ekstrom and Woods [D-103,119] have studied this problem in great detail,
and we now briefly overview their work.

First note that in the nonrecursive (FIR) case, i.e. when B=1, there
is no problem in computing (D.7) output point by output point. There is,
however, an issue concerning what part of the input must be stored at any one
time. In 1-D, we just keep the most recent input points (assuming we compute
y(n) sequentially), but the situation is more complex in 2-D. For example,
suppose we have the "nearest neighbor" filter [D-31]:

    I_1 = {(-1,0), (0,0), (1,0), (0,-1), (0,1)}    (D.8)

Then to compute y(m,n) we need x(m+1,n), x(m,n), x(m-1,n), x(m,n+1), x(m,n-1).
Conversely, we must hold on to x(m,n) until we have computed y(m-1,n), y(m,n),
y(m+1,n), y(m,n-1), y(m,n+1). Thus, depending on the order in which we compute
the y's, we can have very different requirements for storing the x's. Here we
get our first glimpse at the fact that the required storage depends not only
on the degree of the filter but also on the sequencing of computations. Of
course for FIR filters, as in the 1-D case, we can process inputs in blocks,
using a 2-D FFT algorithm together with an appropriate method for taking care
of the overlaps in the blocks. Methods along these lines exactly parallel
the 1-D methods, and we refer the reader to [D-3] for further discussion and
references.

^1 Unless one of the two dimensions is time and we wish to process the input in
real-time. We will have more to say about this later in this section.

Thus, we have that the right-hand side of (D.7) does not raise any
insurmountable obstacles for the sequential processing of inputs (although
there are several interesting questions, as we've seen). The situation is far
different in the recursive case (B \neq constant). Since the right-hand side of
(D.7) causes no difficulties, we assume that it is trivial (A=1) for
convenience. Let us consider one of the most widely used special cases of (D.7):

    \sum_{k=0}^{M} \sum_{\ell=0}^{N} b(k,\ell) y(m-k,n-\ell) = x(m,n)    (D.9)

Assuming that b(0,0) \neq 0, we have

    y(m,n) = -\frac{1}{b(0,0)} \sum_{(k,\ell) \neq (0,0)} b(k,\ell) y(m-k,n-\ell) + \frac{1}{b(0,0)} x(m,n)    (D.10)

and we immediately see that to calculate y(m,n), we only need the values of
outputs to the "southwest".^2 Figures D.1-D.4 illustrate the situation.
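
The recursion (D.10) can be sketched for a finite grid as follows. This is an illustrative implementation under stated assumptions: the boundary conditions are taken to be zero (outputs with a negative index are treated as 0), the grid is swept row by row, and the function and argument names are hypothetical.

```python
def ne_recursive_filter(b, x):
    """Evaluate (D.10) on a finite grid, with y taken as zero outside the grid.

    b[k][l] for 0 <= k <= M, 0 <= l <= N, with b[0][0] != 0; x[m][n] is the input.
    """
    M, N = len(b) - 1, len(b[0]) - 1
    rows, cols = len(x), len(x[0])
    y = [[0.0] * cols for _ in range(rows)]
    for m in range(rows):                    # one of several valid sweep orders
        for n in range(cols):
            acc = x[m][n]
            for k in range(M + 1):
                for l in range(N + 1):
                    if (k, l) == (0, 0):
                        continue
                    if m - k >= 0 and n - l >= 0:
                        acc -= b[k][l] * y[m - k][n - l]
            y[m][n] = acc / b[0][0]
    return y
```

Every output used on the right-hand side has already been computed when it is needed, because it lies in an earlier row or earlier in the current row. Note that the sketch stores the whole output array; the storage actually required depends on the sweep order, as the text goes on to discuss.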


^2 This terminology appears to be due to Pistor [D-42]. It seems to be particularly
appropriate for conveying the geometry of 2-D recursions and causality.

Figure D.1: Support of a First Quadrant or "Northeast" (NE) Function.
(Possible nonzero locations are indicated by solid dots.)

Figure D.2: Required Output Points (Open Dots) to Calculate y(m,n) for the
System Given by (D.9)

Figure D.4: Several Possible Directions of Recursion for (D.9)

In Figure D.1, we see that the support of the function b(k,\ell) is in the
first quadrant. We will call such a function a "northeast" (NE) function
for reasons that will become clear shortly. In Figure D.2 we see the set
of data points to the SW that must be stored in order to enable us to calculate
y(m,n). A consequence of this is seen in Figure D.3. If we are interested
in calculating y(m,n) in the NE quadrant, we must specify initial or boundary
conditions as shown. As we calculate and store some of the output points, we
can discard some of the old values, but it is clear that the amount of storage
needed depends not only on M and N in (D.9) but also on the range of values of
m and n for which we want to calculate y. If either of these ranges is
infinite, the storage needed is infinite.

In addition to this consideration, we also find that the storage
requirements depend on the sequencing of the recursion (we had seen this earlier in
the FIR case). Several directions of recursion are indicated in Figure D.4.
Here (a) depicts the north recursion, (b) is the east recursion, and (c) is a
NE recursion. We can generate other directions of recursion as long as they
remain within the NE quadrant. Each recursion calls for its own sequence of
data accessing and discarding. The N and E recursions appear to have
particularly simple sequencing rules, but the data must be processed serially.
On the other hand, the NE recursion has a more complex sequencing but leads to
the possibility of parallel computation, since, for example, points 4, 5, and 6
can be calculated simultaneously. The possible directions for recursion and
potential uses of parallel computation can be determined with the aid of a
conceptual device, the precedence relation, which partially orders points
with the rule

    (m,n) \prec (\ell,k) if y(m,n) must be calculated first in order to
    be able to calculate y(\ell,k)    (D.11)

Thus (m,n) \prec (\ell,k) if y(m,n) is directly needed to calculate y(\ell,k) or if it
is used to calculate some y(r,s) that is used directly to calculate y(\ell,k),
etc. A discussion of this topic has been given by Chan [D-107,152]. We
will come back to this issue later in this section.
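
One way to make the parallelism of the NE recursion concrete is to group grid points by the sum of their indices. For the quadrant recursion (D.9), every output needed for y(m,n) has a strictly smaller index sum, so no two points with the same sum precede one another. The sketch below is illustrative only: the function names are hypothetical, and the correspondence between these groups and the numbered points of Figure D.4(c) is an assumption, since the figure is not reproduced here.

```python
def wavefronts(rows, cols):
    """Group grid points (m, n) by m + n; each group can be computed in parallel."""
    fronts = [[] for _ in range(rows + cols - 1)]
    for m in range(rows):
        for n in range(cols):
            fronts[m + n].append((m, n))
    return fronts

def directly_precedes(p, q, M, N):
    """True if y(p) appears on the right-hand side of (D.10) when computing y(q)."""
    k, l = q[0] - p[0], q[1] - p[1]
    return (k, l) != (0, 0) and 0 <= k <= M and 0 <= l <= N
```

On a 3 x 3 grid the first three groups contain 1, 2, and 3 points, and within any group `directly_precedes` is false for every pair. The full precedence relation of (D.11) is the transitive closure of `directly_precedes`.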

Let us now return to the question of recursibility. Clearly the picture
is symmetric, i.e. we can have NW, SE, and SW recursions, with b(k,\ell)
restricted to be a function on the corresponding quadrant. However, as shown by
Dudgeon [D-5,33], this by no means exhausts the possibilities for recursion.
In addition to the one quadrant functions, we can obtain recursive difference
equations with b(k,\ell)'s that are one-sided [D-5]. To illustrate the idea,
consider the equation

    y(m,n) = -\frac{1}{b(0,0)} \sum_{k=1}^{M} \sum_{\ell=-N}^{N} b(k,\ell) y(m-k,n-\ell) - \frac{1}{b(0,0)} \sum_{\ell=1}^{N} b(0,\ell) y(m,n-\ell) + \frac{1}{b(0,0)} x(m,n)    (D.12)

Figure D.5 illustrates the support of such a function, while Figure D.6
indicates how points are recursed and what initial conditions are needed.
Here we calculate the data points column by column, using data points to the
south and to the west (not just the southwest). Hence the directions of
recursion are far more limited than in the single quadrant case, since we cannot
move east until all of the data points required in the present column have
been calculated. For more details, we refer the reader to [D-5,33,100], in
which related issues are discussed, such as the rotation of the support of
one-sided or one-quadrant functions to obtain recursions at various angles.

Figure D.5: Support of a One-Sided Function

Figure D.6: Illustrating the Required Initial Conditions and the
Direction of Recursion for the Filter of Equation (D.12)

Thus, we have seen one place in which the 2-D case is more complex than
the one-dimensional case: the notion of recursibility and some of its geometric
interpretations. One can avoid many of these difficulties by sticking to
nonrecursive designs, but recursive techniques offer enough potential
advantages in computation time and storage to warrant further detailed study.

Let us make another connection with 1-D processing. Suppose that one of
the two indices, say m, has the interpretation as time. Then one might think
of y(m,n) and x(m,n) as (1-D) spatially distributed processes that evolve
in time. Temporal causality might then correspond to the support of b in
Figure D.5 being modified by deleting the points on the positive n axis,
yielding a "strictly" one-sided function. In this case, one could define the
"state" of the system, and it is clear that this "state" will be finite
dimensional only if the range of n is bounded, which is precisely when the required
storage for the 2-D recursion is finite. This clearly shows why the order of
a 2-D filter does not by itself specify the storage requirements: one
must also know the range of m and n. Hence we see that 2-D digital filtering
of scalar (or perhaps vector) variables bears some resemblance to the 1-D
state space framework for multi- and possibly infinite-dimensional systems that
arise in multivariable and distributed system and control theory.

An intriguing question is: can this interrelationship be exploited to
yield useful insights and/or results on either or both sides of the coin?
The answer is, of course, yes. Such problems arise in seismic signal
processing, in which the data to be processed, x(m,n), varies in time (m) and
also in array sensor location (n). We refer the reader to the references for
more on this problem. Encouraged by this example of the successful
exploitation of the 2-D, multivariable 1-D interrelationship, one can ask a number
of rather speculative questions. In large-scale system theory, we often
have a number of subsystems coupled together, and one is interested in efficient
processing of data and control of such systems. Viewing the variables as
functions of two independent parameters, time and subsystem index, can we
obtain any insights into the control and processing of large systems with the
aid of 2-D digital filtering concepts? Note that this would involve the
consideration of feedback for 2-D systems (footnote 3), a topic that, to our
knowledge, has never been addressed in the digital filtering context (with good
reason: it is irrelevant for usual 2-D processing problems). We have been
somewhat vague about this topic, but we shall return to this large system/2-D
filter idea several times in this section, as there are a number of interesting
insights and questions that can be raised. Another possible use of 2-D concepts
for 1-D problems is in the analysis of time-varying 1-D systems, in which one can
define a system function in two variables, a transform variable and time.
Such concepts may also have value in developing time-varying linear prediction
algorithms. On the other side, in order to study questions such as stability
or roundoff noise behavior for 2-D filters, is there any benefit in viewing the
2-D filter as a multivariable 1-D system? Capetenakis [D-90] has begun such
an investigation for NE filters (not strictly one-sided). His results are very
preliminary and do not yet yield anything new, so this also
remains a possible direction for further work.

Footnote 3: Of course, causality constraints would have to be built in. For example, the
feedback from y to x would also have to involve a strictly one-sided recursion.

As mentioned earlier and as discussed in [D-107,152], the ability to
solve a 2-D difference equation recursively leads directly to the definition
of a partial order (D.11) on the part of the 2-D grid over which we wish to
solve the equation. Given this partial order (the precedence relation),
one then has some freedom in deciding how to sequence the calculations.
Specifically, if we think of a sequence of calculations as determining a total
order (denoted by <) on the part of the 2-D grid of interest, all we require
is that this total order be compatible with the precedence relation. That is,
$(m,n)$ is calculated before $(\ell,k)$ (written $(m,n) < (\ell,k)$) if $(m,n) \prec (\ell,k)$. Manry
and Aggarwal [D-56] have studied such order relations for NE recursive filters.
One of their first observations is the following: given a compatible total
order, the first quadrant can be put into a 1-1, order-preserving correspondence
with the nonnegative integers:

    $Q(m,n) = r \iff$ there are precisely $(r+1)$ points in the NE
    quadrant that are $\le (m,n)$ in the total order                (D.13)

Given the function Q, one can think of (D.10) as determining a 1-D filter, with
(m,n) replaced by Q(m,n), etc. Alternatively, given the ordering (D.13), we can
think of processing the input x(m,n) with a linear time-invariant 1-D filter.
One finds (see also Mersereau and Dudgeon [D-3,55]) that in general neither of
these filters (the 1-D filter obtained from (D.10) and (D.13), or the 2-D
filter obtained from a given LTI 1-D filter and (D.13)) is shift-invariant
(they are, of course, both still linear).
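
To make the idea of a compatible total order concrete, here is a minimal Python sketch of one possible index function Q for the NE quadrant. The diagonal enumeration `q_diag` and the helper `compatible` are hypothetical choices made for illustration; the text does not say that this is the order used in [D-56]. The check assumes a NE recursion in which y(m,n) may depend on any point to its southwest.

```python
def q_diag(m, n):
    """Hypothetical diagonal-scan index for the NE quadrant.

    Counts all points on earlier anti-diagonals (m + n smaller), then
    steps along the current anti-diagonal.
    """
    d = m + n
    return d * (d + 1) // 2 + n

def compatible(order, size):
    """Check on a size-by-size block that every point to the SW of (m, n)
    receives a strictly smaller index than (m, n) itself."""
    for m in range(size):
        for n in range(size):
            for k in range(m + 1):
                for l in range(n + 1):
                    if (k, l) != (m, n) and order(k, l) >= order(m, n):
                        return False
    return True

# Index gap between a point and its western neighbour (m - 1, n).
gap_near_origin = q_diag(1, 0) - q_diag(0, 0)
gap_further_out = q_diag(3, 2) - q_diag(2, 2)
```

Under this particular order the gap between (m,n) and (m-1,n) equals m + n, so it changes with position. This is one way to see why a shift-invariant 2-D recursion generally does not look shift-invariant after the grid is mapped onto the integers.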

Let us examine several orders. Manry and Aggarwal suggest the order of
Figure D.4(c), since every point in the NE quadrant is mapped by (D.13) into
a finite integer. The orders suggested by Figure D.4(a) or (b) are well-posed
only if the desired range of one of the variables m or n is finite. In this
case, we obtain several possible orders, as given in Figure D.7. In both
cases, the range of values is limited in the n direction. Manry and Aggarwal
suggest the section-scan of Figure D.7(b). They then show that except for
effects near the bottom line or at the junctions of two sections, the 1-D
difference equation from (D.10) and (D.13) looks shift-invariant. They then
show that, assuming this shift-invariance holds throughout the entire region,
one obtains a 1-D stable filter, and one can overlap sections in order to
reduce the errors at section junctions, leaving substantial errors only at the
far left and along the bottom. These errors notwithstanding, this approach
provides an extremely promising way of using 1-D filter design techniques
to design filters to process 2-D data.

The "scan" order of Figure D.7(a) has been widely used in processing
images via line-by-line scans [D-3,21,55,58]. Nahi [D-21,58] has used this
to develop stochastic models for image processing, and the shift-variance
introduced by doing 1-D processing on the scan-ordered data points causes errors
along the bottom (we will have more to say about this in the next subsection).
Mersereau and Dudgeon [D-3,55] point this out, noting that only periodic unit
sample responses of the form h(m,n) = h(m+1,n-N) can be realized exactly by a
1-D shift-invariant filter working on the scan-ordered data. They also spend
a great deal of time studying this order when the data array is finite in both
directions, as it is for a 2-D finite impulse response function. As they
point out, in this case the Fourier transform of the 1-D scan signal is a
"slice" of the 2-D Fourier transform of the original data array. Since the order
is invertible, we immediately see that we can completely recover the 2-D
transform from this slice (which they term a critical slice because of this property).
Consequently, with the aid of 1-D design methods, one can use this scan-ordering
for 2-D FIR filter design: given the Fourier transform of the ideal 2-D
filter, we take a critical slice, hence obtaining an ideal 1-D filter; we use
1-D design methods to determine an approximation to this ideal transfer function.
We then either use this 1-D filter to process the scanned data (footnote 4), or we
invert the 1-D filter, regarded as a critical slice, to find a 2-D filter which can
operate directly on the 2-D array. For details we refer the reader to [D-3,55].
We also note that a closely related result involves the recovery of 2-D images
from knowledge of 1-D projections of the array. Such a technique is of great
interest in biomedical applications such as tomography, and we refer the reader
to [D-29] for a detailed survey of the theory and available algorithms related
to this subject.

Figure D.7: Two Orders for 1-D Processing of 2-D Signals.
(b) The "Section-Scan" Order of Manry-Aggarwal [D-56].
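
The role of zero padding when a scanned 2-D array is filtered with a 1-D FIR filter can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure of [D-55]: the function names, the row-by-row scan convention, and the small test arrays are all assumptions. Each row is padded to the width of the full 2-D convolution before scanning, so that products from one scan line cannot spill into the next.

```python
def conv2d(x, h):
    """Direct 2-D linear convolution of two finite arrays (lists of rows)."""
    rows, cols = len(x) + len(h) - 1, len(x[0]) + len(h[0]) - 1
    y = [[0] * cols for _ in range(rows)]
    for i, xrow in enumerate(x):
        for j, xv in enumerate(xrow):
            for p, hrow in enumerate(h):
                for q, hv in enumerate(hrow):
                    y[i + p][j + q] += xv * hv
    return y

def scan(a, width):
    """Row-by-row scan after zero-padding every row to a common width."""
    return [v for row in a for v in list(row) + [0] * (width - len(row))]

def conv1d(u, v):
    """Direct 1-D linear convolution."""
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += a * b
    return out

def conv2d_via_scan(x, h):
    """2-D convolution computed as a 1-D convolution of padded scans."""
    width = len(x[0]) + len(h[0]) - 1      # padded row length
    rows = len(x) + len(h) - 1
    y1 = conv1d(scan(x, width), scan(h, width))
    # "Unscan": cut the 1-D result back into rows of the padded width.
    return [y1[r * width:(r + 1) * width] for r in range(rows)]

x = [[1, 2], [3, 4], [5, 6]]
h = [[1, -1, 2], [0, 3, 1]]
y_scan = conv2d_via_scan(x, h)
y_direct = conv2d(x, h)
```

With the padding in place the two results agree exactly for these integer arrays; without it, the tail of one scan line would overlap the start of the next.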

Footnote 4: One must be careful here to pad the 1-D finite impulse response and the scan
signal with zeroes. This is necessary because the extent of the convolution of two
finite 2-D arrays is larger than the original arrays. In order to be able to
invert ("unscan") the convolved 1-D signal to obtain the 2-D output, we must
effectively scan enough zeroes at the end of each of the original 2-D arrays. See
[D-55] for details.

We close our discussion of 2-D orders and precedence relations by noting
that these very same issues arise naturally in certain feedback control problems.
Ho and Chu [D-87,88] consider optimal control problems in which one has a set
of decision makers who base their decisions on certain observed data, which
may be affected by the decisions of others. These decisions may be specified to
be made at different points in time and/or by distinct decision makers at the
same points in time. Ho and Chu define a precedence relation among decisions:

    $j \prec i$ if the decision of $j$ affects the observation of $i$      (D.14)

and they assume that this is a partial order, i.e. that if $j \prec i$ we cannot
have $i \prec j$ (this is precisely the condition needed for recursibility of 2-D
filters). Then, under a "partially nested information condition" (if $j \prec i$,
then $i$'s observation includes knowledge of $j$'s observation), they solve an
optimal control problem (footnote 5). When the partial order is a total order, i.e. when
$\prec$ is really just time ordering, this is the usual optimal control problem. In
the non-total order case, one can have simultaneous (i.e. incomparable)
decision makers who do not affect each other's observations.

Witsenhausen [D-93,94] has also studied this partial order and has raised
issues analogous to those of Chan [D-107,152]. Witsenhausen points out that
the amount of parallelism in the control system is essentially a measure of the
number of incomparable decision makers (this number may vary with time). In
addition, if one totally orders the set of decision makers in a way compatible
with (D.14), one can then define the state evolution of the system. Hence we
see that there may be many possible sets of states corresponding to different
compatible total orders. In fact, using a generalization of the Nerode notion
of state, Witsenhausen shows that the set of possible states forms a lattice.
All of this is developed with certain decentralized control problems (i.e.
involving incomparable decision makers) in mind, and Witsenhausen points out
that it is not clear if the notion of state he has introduced will be of use
in solving such problems (as it is in the classical totally ordered case). He
also mentions that the a priori partial order restriction does not hold in some
game theory problems, in which the sequence of future decision makers can be
affected by prior decisions. The difficulties here, as with those of
nonrecursible 2-D filters, are quite substantial.

Footnote 5: When this condition is not satisfied, the problem is more difficult; essentially,
information is forgotten. In this case Chu [D-88] discusses some examples in
which the optimal solution can be found with the aid of the partially nested
result, and he discusses some suboptimal methods. We refer the reader to [D-88]
for details.

An important problem in the study or design of 2-D recursive filters is
stability, where, as in [D-3,16,49], we define stability as the absolute
summability of the unit impulse response

$$ \sum_{m,n=-\infty}^{+\infty} |h(m,n)| < \infty \qquad \text{(D.15)} $$

This condition is equivalent to bounded-input/bounded-output stability. As
one might expect from knowledge of the 1-D case, the stability of a filter may
depend on the direction of recursion. For example, the equation

$$ y(m+1,n) = 2\,y(m,n) + x(m+1,n) \qquad \text{(D.16)} $$

is unstable if recursed to the east, but it is stable

$$ y(m,n) = \tfrac{1}{2}\,y(m+1,n) - \tfrac{1}{2}\,x(m+1,n) \qquad \text{(D.17)} $$

if we solve to the west.
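
The contrast between (D.16) and (D.17) can be checked numerically. The Python sketch below drives both forms with a unit impulse at m = 0 along a single row (n fixed); the function names and the zero boundary conditions are assumptions made for illustration.

```python
def east_impulse_response(steps):
    """(D.16) solved toward increasing m: y(m+1) = 2 y(m) + x(m+1).

    Input is a unit impulse at m = 0 with y = 0 to the west of it.
    Returns y(0), y(1), ...
    """
    y, out = 0.0, []
    for m in range(steps):
        x = 1.0 if m == 0 else 0.0
        y = 2.0 * y + x
        out.append(y)
    return out

def west_impulse_response(steps):
    """(D.17) solved toward decreasing m: y(m) = y(m+1)/2 - x(m+1)/2.

    Same unit impulse at m = 0, with y = 0 to the east of it.
    Returns y(-1), y(-2), ...
    """
    y_next, out = 0.0, []
    for step in range(steps):
        m = -1 - step
        x_next = 1.0 if m + 1 == 0 else 0.0
        y = 0.5 * y_next - 0.5 * x_next
        out.append(y)
        y_next = y
    return out

east = east_impulse_response(4)
west = west_impulse_response(4)
```

The eastward response doubles at every step, while the westward response halves, so only the westward solution is absolutely summable in the sense of (D.15).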

Shanks, Treitel, and Justice [D-49] considered the stability of 2-D
systems with rational transfer functions as in (D.5), (D.6), with b a NE
function, as in (D.9). They also explicitly considered stability of the
recursion in the NE direction only, i.e. we use (D.10) to compute y(m,n) from
inputs plus outputs to the SW. In this case, they obtained a direct analog of
the 1-D stability result:

    A rational transfer function $H(z_1,z_2)$ as in (D.5)
    with b a NE function is recursively stable in the
    NE direction if and only if no zero of the denominator
    $B(z_1,z_2)$ lies in the region

        $\{|z_1| \ge 1\} \cap \{|z_2| \ge 1\}$                      (D.18)

As in the 1-D case, we can use a NE b to define a SW recursion: instead of
going from (D.9) to (D.10), remove $y(m-M,n-N)$ from the sum in (D.9); we can
then recursively compute this quantity using outputs to the NE. Similarly,
we can pull out the other two "corner" elements to obtain NW and SE recursions.
Hence we have 4 possibilities as opposed to the 2 in 1-D, (D.16), (D.17). As
in the 1-D case, we obtain different stability conditions for these four cases,
which Huang [D-50] has derived. For example, for the SW recursion, we have
stability if and only if no zeroes of $B(z_1,z_2)$ lie in
$\{|z_1| \le 1\} \cap \{|z_2| \le 1\}$.
Huang showed that at most one of the four directions of recursion can lead to
a stable filter. In addition, Justice and Shanks [D-16] extended these ideas
to recursions in different directions for N-D filters in which B does not
necessarily have to have finite degree in $z_1,\ldots,z_N$ and $z_1^{-1},\ldots,z_N^{-1}$. We refer
the interested reader to [D-16] for a detailed statement and proof of these
results.

We now turn our attention to the problem of checking conditions such as
(D.18). As mentioned in [D-3], the problem is complicated by the fact that
the zeroes of $B(z_1,z_2)$ are not isolated points, but rather are surfaces. This
makes the direct checking of (D.18) quite difficult (one must map
$|z_1| \ge 1$ into the $z_2$-plane via the implicit relation $B(z_1,z_2) = 0$; we then have
stability if and only if the image lies within the unit circle $|z_2| < 1$).
Fortunately, a number of simplifications of the criterion (D.18) have been made.
Huang [D-50] showed that (D.18) holds if and only if

$$ B(z_1,\infty) \ne 0, \qquad |z_1| \ge 1 \qquad \text{(D.19)} $$

$$ B(z_1,z_2) \ne 0, \qquad |z_1| = 1,\ |z_2| \ge 1 \qquad \text{(D.20)} $$

A generalization of this type of criterion to N dimensions has been made by
Anderson and Jury [D-45].
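
For a first-order denominator the two conditions (D.19), (D.20) can be explored numerically. The Python sketch below is only a sampled heuristic, not one of the finite algebraic tests discussed in this section: it assumes the hypothetical form B(z1,z2) = 1 + b10/z1 + b01/z2 + b11/(z1 z2), for which the single root in z2 is available in closed form, and it checks (D.20) on a finite grid of points on |z1| = 1.

```python
import cmath
import math

def huang_test_sampled(b10, b01, b11, samples=720):
    """Sampled check of (D.19), (D.20) for
    B(z1, z2) = 1 + b10/z1 + b01/z2 + b11/(z1*z2)  (assumed form)."""
    # (D.19): B(z1, inf) = 1 + b10/z1 has its only zero at z1 = -b10,
    # which must lie strictly inside the unit circle.
    if abs(b10) >= 1:
        return False
    # (D.20): for each sampled z1 on the unit circle, solve B = 0 for z2
    # and require the root to lie strictly inside the unit circle.
    for i in range(samples):
        z1 = cmath.exp(2j * math.pi * i / samples)
        z2_root = -(b01 + b11 / z1) / (1 + b10 / z1)
        if abs(z2_root) >= 1:
            return False
    return True

stable_case = huang_test_sampled(-0.4, -0.4, 0.0)
unstable_case = huang_test_sampled(-0.6, -0.6, 0.0)
```

Because the unit circle is only sampled, a passing result is suggestive rather than conclusive; the finite tests cited in the text avoid this limitation.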

Let us consider the computations involved in (D.19), (D.20). First we
note that the test of condition (D.19) is essentially a 1-D stability test,
since $B(z_1,\infty)$ is a polynomial in $z_1^{-1}$. On the surface, however, it appears
that (D.20) requires an infinite amount of computation (again we must map
$|z_1| = 1$ into the $z_2$-plane via $B(z_1,z_2) = 0$). Fortunately, there are several
finite algorithms for testing for conditions such as (D.20). Huang himself
used a 2-D bilinear transformation to modify condition (D.20) in such a manner
that the continuous 2-D parameter results of Ansell [D-64] could be used (footnote 6).
Ansell's test consisted of a Hermite test which checks for the positivity of
the principal minors of a symmetric matrix of polynomials in one variable
(this is a positive-real type of test). The positivity tests in turn can be
performed using Sturm tests (we refer the reader to [D-50,54,64] for details).

Footnote 6: Two-variable system functions arise in a variety of problems. We will return
to investigate the connections among these problems at a later point in our
discussion.

Anderson and Jury [D-54] suggested another method for checking (D.20).
Instead of using the bilinear transform plus the Hermite test, one can work
directly with condition (D.20), using a Schur-Cohn test (see [D-125]) that
replaces (D.20) with a check for the positivity of all of the minors of a
certain Hermitian matrix of polynomials in one variable. Again one can use
Sturm tests on the individual minors. An alternative to this approach was
proposed by Maria and Fahmy [D-130], who used a modified version of the Jury
table [D-125] to obtain a finite check of (D.20).

Recently, an algorithm far simpler than these and also better suited for
computer implementation was developed by Siljak [D-27]. The key to this
algorithm is a powerful result [D-122] on the positivity of polynomial matrices.
This result, developed with the applications of multivariable positive real
functions to networks in mind (see, for example, [D-131]), replaces the sequence
of tests of positivity of principal minors with two tests, independent of the
dimension of the matrix. Specifically, one need only test for positivity of
the matrix at a single value of the independent variable and for the positivity
of the determinant. We refer the reader to [D-27] for details and for further
remarks on the relationship of these stability results to multivariable
techniques arising in network synthesis.

We also note that a great deal of work has been done on extending tests
for stability and positivity to the N-D case. Anderson and Jury [D-45]
extended their use of the Schur-Cohn test to higher dimensions, but did not
directly propose a finite algorithm for the positivity tests one must perform
on polynomials in (N-1) variables (which arise as principal minors from the
Schur-Cohn test). Bose and Jury [D-57] developed such an algorithm in the 3-D
case, in which the 2-D positivity tests reduce to tests for sign variations
of single-variable polynomials defined on the unit circle in the complex plane.
They also develop an extremely efficient method for computing multidimensional
bilinear transformations, which allows them to develop a stability test
algorithm for 3-D continuous systems. Subsequently, and with the aid of results
from decision algebra, Bose and Kamat [D-124] devised an algorithm for
implementing Jury table calculations for an N-D stability test which involves a
finite number of multivariable polynomial multiplications and a finite number
of single-variable polynomial factorizations (a nontrivial numerical problem).
In addition, Bose [D-147,148] and Bose and Modarressi [D-118,150] have used
concepts from Jury's theory of inners [D-123] for developing tests for positivity,
nonnegativity, and greatest common factors of multivariable polynomials. Such
tests are needed not only in multivariable stability and positivity tests, but
also find applications in applying Lyapunov's direct method to test for the
stability of multi-state-variable, 1-D systems (footnote 7). We refer the reader to the
references.

Footnote 7: This leads to the question of extending Lyapunov methods to systems with more than
1-D time. To do this requires the notion of "state" for such systems. We shall
discuss this problem at length later in this section.

The issue of stability is clearly of great importance in filter design,
but, as Mersereau and Dudgeon [D-3] point out, it is not enough to have a
stability test. Rather, one wants a procedure for taking given frequency
response characteristics and generating stable, recursive filters that possess
these characteristics. One approach is to take a given transfer function and
to stabilize it by finding a stable system function that has the same magnitude
function for its frequency response. In 1-D, this process can easily be
accomplished by replacing poles outside the unit circle with poles at conjugate
reciprocal locations. An algebraic approach to this problem does not work in
2-D, since in general we cannot factor 2-D polynomials (footnote 8). However, another 1-D
approach to performing this stabilization, involving the use of discrete
Hilbert transform techniques, has been extended to the 2-D case by Read and
Treitel [D-53]. In this approach, one takes the denominator polynomial of a
given rational response and calculates its log-magnitude function. Then the
2-D discrete Hilbert transform can be used to determine the minimum phase
function associated with the log-magnitude function. One can then
exponentiate to obtain the desired stable denominator. As Mersereau and Dudgeon [D-3]
point out, one of the difficulties with this method is that the resulting
denominator need not be of finite order. Read and Treitel point out that this
also can be traced to the lack of a fundamental theorem of algebra.
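
The 1-D pole-reflection step mentioned above is easy to sketch, precisely because a 1-D denominator factors into first-order terms. In the Python sketch below the all-pole form H(z) = gain / prod(1 - p/z), the function names, and the example poles are assumptions made for illustration.

```python
import cmath

def stabilize_poles(poles, gain=1.0):
    """Reflect poles with |p| > 1 to 1/conj(p).

    The gain is rescaled so that |H| on the unit circle is unchanged,
    using |1 - p/z| = |p| * |1 - (1/conj(p))/z| for |z| = 1.
    """
    new_poles = []
    for p in poles:
        if abs(p) > 1:
            gain /= abs(p)
            p = 1 / p.conjugate()
        new_poles.append(p)
    return new_poles, gain

def magnitude(poles, gain, w):
    """|H(e^{jw})| for the all-pole H(z) = gain / prod(1 - p/z)."""
    z = cmath.exp(1j * w)
    den = 1.0
    for p in poles:
        den *= (1 - p / z)
    return abs(gain / den)

poles = [2 + 0j, 0.5j, -1.5 + 1j]          # two poles outside the unit circle
new_poles, new_gain = stabilize_poles(poles, 1.0)
```

The reflected filter has all its poles inside the unit circle and the same magnitude response as the original. No such pole-by-pole operation is available in 2-D, where the denominator generally does not factor.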

Another approach to stabilization is to use spectral factorization to
break a given system function into the product of several pieces, each of
which is stable with respect to a different direction of recursion. In 1-D,
the fundamental theorem of algebra allows us to write any rational H(z) as

$$ H(z) = H_E(z)\,H_W(z) \qquad \text{(D.21)} $$

where $H_E$ has all its poles inside the unit circle (and hence is stable if
used to process inputs in the eastern direction) and $H_W$ has all its poles
outside the unit circle (stable to the west). Thus, in 2-D, one is tempted
to seek one of several such factorizations. All of these involve the use of
2-D cepstral analysis to perform the factorization, and we refer the reader
to [D-33,42,100,119] for results on the existence and properties of 2-D
cepstra. Following [D-100], let us recall just a few of these properties.

Footnote 8: This is often referred to as the "absence of the fundamental theorem of algebra"
for multivariable polynomials (see, for example, [D-3]).

Given a 2-D signal $s(m,n)$ and its transform $S(z_1,z_2)$, the complex cepstrum
$\hat{s}(m,n)$ (if it exists) is the inverse transform of $\ln[S(z_1,z_2)]$. Thus, if we
are given a rational system function $H(z_1,z_2)$ and wish to break it up into the
cascade of four stable quadrant filters [D-42] (footnote 9)

$$ H(z_1,z_2) = H_{NE}(z_1,z_2)\,H_{NW}(z_1,z_2)\,H_{SW}(z_1,z_2)\,H_{SE}(z_1,z_2) \qquad \text{(D.22)} $$

or two stable half-plane filters (using Dudgeon's one-sided functions) [D-5,33]

$$ H(z_1,z_2) = H_E(z_1,z_2)\,H_W(z_1,z_2) \qquad \text{(D.23)} $$

this can be accomplished by additively decomposing $\hat{h}(m,n)$ into the "corresponding
pieces." Thus, we will in principle have developed the desired spectral
factorization algorithm once we determine the properties of cepstra of signals
that are "minimum phase", where we define minimum phase in analogy with 1-D,
and we follow [D-100]. Specifically, $s(m,n)$ is minimum phase with respect to a
given quadrant (NE, NW, SW, SE) or half plane (E, W) if the signal and its inverse
under convolution are both zero outside the given sector and if both
are the impulse responses of stable filters that are recursively
implemented in the direction associated with the given sector. Examining (D.22),
(D.23), the factorizations of interest have the property that the impulse
response for each piece (e.g. $h_{NE}(m,n) \leftrightarrow H_{NE}(z_1,z_2)$) is minimum phase
($H_{NE}(z_1,z_2)$ has no poles or zeroes in $\{|z_1| \ge 1\} \cap \{|z_2| \ge 1\}$, and $h_{NE}(m,n)$
and its convolutional inverse are NE quadrant signals).

Footnote 9: Since it is only the denominator of $H(z_1,z_2) = A(z_1,z_2)/B(z_1,z_2)$ that matters
as far as direction of recursibility and stability, one often considers applying
this procedure to $1/B(z_1,z_2)$.


One then obtains the desired algorithm by using the following important
property [D-100]:

    A signal is minimum phase with respect to a given
    sector if and only if its 2-D cepstrum is zero
    outside this sector.

Using this property, we can derive the 4 piece spectral factorization of
Pistor [D-42] or the 2 piece factorization of Dudgeon via the following
algorithm: given $h(m,n)$, calculate $\hat{h}(m,n)$. Consider the restrictions of $\hat{h}$ to
the various (4 or 2) sectors of interest, for example

$$ \hat{h}(m,n) = \hat{h}_{NE}(m,n) + \hat{h}_{NW}(m,n) + \hat{h}_{SE}(m,n) + \hat{h}_{SW}(m,n) \qquad \text{(D.24)} $$

The desired spectral factors are the complex exponentials of the transforms
of these restrictions. This, in principle, solves the spectral factorization
problem, but unfortunately the absence of a fundamental theorem of algebra gets
in the way again. Unlike the 1-D case, the factors in (D.22) and (D.23) need not be
ratios of finite order polynomials. Hence each piece in principle requires
an infinite amount of storage (e.g. for a NE filter we must keep all data
points to the SW). Approximations are clearly needed, and we refer the reader
to [D-33,42] for details.

An excellent treatment of the use of cepstra for spectral factorization
is given in Ekstrom and Woods [D-119]. In addition to considering
the 2 and 4 factor cases, they consider an 8 factor case: 4 factors
corresponding to signals that are strictly in the 4 quadrants (i.e. they are
zero along the coordinate axes) plus 4 factors for the 4 pieces of the axes
(the two halves of each coordinate axis). These last 4 pieces
correspond to the separable part of the system function (a system function is
separable if and only if it is of the form $H_1(z_1)H_2(z_2)$), while the other
4 pieces can be viewed as "totally non-separable." Using this factorization
applied to a NE quadrant filter, Ekstrom and Woods obtain an interesting
interpretation of the two conditions of Huang's [D-50] stability test (D.19),
(D.20). Essentially we factor our system function as follows




$$ B(z_1,z_2) = B_1(z_1)\,B_2(z_2)\,B_{SNE}(z_1,z_2) \qquad \text{(D.25)} $$

where "SNE" means "strictly NE". Then (D.19) corresponds to $b_1(m)$ being
minimum phase, while (D.20) implies that $B_2(z_2)\,B_{SNE}(z_1,z_2)$ is minimum phase.

Ekstrom and Woods also discuss the likelihood that the factors are not of
finite order; in fact, if one factorization has finite order factors, this
does not imply that either of the other two factorizations does. They also
discuss the numerical calculation of cepstra and of the spectral factors, and
they in fact propose this as an algorithm to test for stability (b satisfies
(D.19), (D.20) if and only if its cepstrum $\hat{b}$ is a NE quadrant function). Such a
procedure in principle requires an infinite amount of computation (we must check
$\hat{b} = 0$ over all three other quadrants), but one can obtain a fast approximate test
by looking over a restricted part of the plane.

The question of stable filter design to approximate a given frequency
magnitude-response function is considered in [D-119]. Ekstrom and Woods point out that
to do this one needs two one-quadrant filters (NE and NW, NW and SW, SW and SE,
or SE and NE), but one can make the approximation with a single half-plane
filter. Ekstrom and Woods also consider the finite order approximation of
the infinite degree rational functions that arise as factors in the spectral
factorization. Intuitively, one wants to window the denominator power series
to obtain a finite order series that remains stable. In [D-119] it is shown
that stability is preserved if one uses an exponential window.
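
A 1-D analogy gives some feel for why exponential weighting is a safe choice. Weighting the n-th denominator coefficient by alpha**n with 0 < alpha < 1 replaces B(z) by B(z/alpha), so every zero of B moves toward the origin by the factor alpha. The Python sketch below shows only this root-contraction effect for a second-order 1-D denominator; it does not reproduce the 2-D truncation argument of [D-119], and all names and numbers in it are illustrative assumptions.

```python
import cmath

def roots_of_denominator(b1, b2):
    """Zeros in z of B(z) = 1 + b1/z + b2/z**2, i.e. roots of z**2 + b1*z + b2."""
    disc = cmath.sqrt(b1 * b1 - 4 * b2)
    return (-b1 + disc) / 2, (-b1 - disc) / 2

def exponential_window(coeffs, alpha):
    """Weight the n-th denominator coefficient by alpha**n."""
    return [c * alpha ** n for n, c in enumerate(coeffs)]

b = [1.0, -1.7, 0.72]                  # zeros at z = 0.9 and z = 0.8
alpha = 0.5
bw = exponential_window(b, alpha)
original = roots_of_denominator(b[1], b[2])
windowed = roots_of_denominator(bw[1], bw[2])
```

Each windowed root is alpha times the corresponding original root, so a denominator whose zeros were inside the unit circle keeps them there after weighting.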

Footnote 10: Also, as discussed in [D-49], in order to implement a zero-phase filter by means
of causal, recursive filters, one needs 4 identical quadrant filters (one for
each direction) or 2 identical half-plane filters. This is the analog of the
1-D result, in which we realize a zero-phase filter as the cascade of a given
filter, followed by an identical filter going backward in time.

In closing, let us note that in Section B we saw that one could devise
state space stochastic realization procedures to perform the desired spectral
factorization. As one might expect, in 2-D there are some difficulties with
this type of procedure, but some results do exist. We will talk about this
further when we discuss 2-D state space methods.

A final stabilization procedure is based on the guaranteed stability in
1-D of least squares inverses. The least squares inverse (LSI) is obtained
using exactly the methodology one brings into play in performing linear prediction
of speech (see Section B). Given the denominator B and its inverse transform
b, one seeks a finite extent impulse response p that approximates the
convolutional inverse of b. One then seeks to choose the coefficients in p
to minimize the sum of the squares of the difference between b*p and the unit
impulse. In 1-D, this leads to the fast algorithms described in Section B, in
which one iterates on the extent of p. One also has the guarantee that p is
minimum phase (i.e. that the all-pole model 1/P is stable). In [D-49] Shanks
et al. conjectured that this minimum phase property holds in 2-D. Under this
assumption, they proposed the use of a double least squares inverse to
stabilize an unstable denominator. That is, given b, we calculate its LSI p, and
we then calculate the LSI of p. By conjecture this second LSI is minimum phase, and
hopefully it is a good approximation of b (at least in magnitude
on $|z_1| = |z_2| = 1$). Using this design procedure, numerous 2-D filters have been
designed (see, for example, [D-49]). Unfortunately, Genin and Kamp [D-144,145]
have recently shown that this conjecture is false in 2-D. Not only does this
make suspect the aforementioned design procedure, but it also makes more
difficult the extension of linear prediction concepts to 2-D. We will have more
to say about this in the next subsection. Suffice it for us to note here that,
unlike the 1-D case [B-26], in the 2-D case the linear prediction solution does
not match the first few correlation coefficients [D-66,67,156].
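
The 1-D least squares inverse can be computed directly from its normal equations, which gives a concrete picture of the procedure described above. The Python sketch below is a plain Gaussian-elimination solution, not the fast recursive algorithm of Section B; the function name and the example sequence are assumptions made for illustration.

```python
def least_squares_inverse(b, length):
    """Finite-extent p minimising the squared error between b * p and a
    unit impulse (1-D case), obtained by solving R p = c, where R is the
    Toeplitz autocorrelation matrix of b and c = (b[0], 0, ..., 0)."""
    def r(lag):
        lag = abs(lag)
        return sum(b[i] * b[i + lag] for i in range(len(b) - lag))
    # Augmented system [R | c].
    a = [[float(r(i - j)) for j in range(length)]
         + [float(b[0]) if i == 0 else 0.0]
         for i in range(length)]
    for col in range(length):            # elimination with partial pivoting
        piv = max(range(col, length), key=lambda i: abs(a[i][col]))
        a[col], a[piv] = a[piv], a[col]
        for i in range(col + 1, length):
            f = a[i][col] / a[col][col]
            for j in range(col, length + 1):
                a[i][j] -= f * a[col][j]
    p = [0.0] * length
    for i in reversed(range(length)):    # back substitution
        tail = sum(a[i][j] * p[j] for j in range(i + 1, length))
        p[i] = (a[i][length] - tail) / a[i][i]
    return p

b = [1.0, -2.0]                # B(z) = 1 - 2/z has its zero at z = 2
p = least_squares_inverse(b, 2)
zero_of_p = -p[1] / p[0]       # zero of P(z) = p[0] + p[1]/z
```

In this example b is not minimum phase, yet the zero of its length-2 LSI lies inside the unit circle, in line with the 1-D guarantee. The counterexamples of Genin and Kamp show that the same behavior cannot be relied on in 2-D.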

Let us make a few final comments concerning 2-D design and structures 
questions. Again, one finds that certain 1-D concepts and techniques do extend, 
while others do not. One of the earliest design methods proposed was by 
Treitel and Shanks [D-48] , in which they suggested approximating a desired 
impulse response h (m,n) as a sum of separable terms 
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N 

h(m,n) = {m)g (n) (D.26) 

x=l ^ 

If h IS of limted extent one can xn prxncxple do thxs exactly, by vxewxng h 
as a matrxx and then fxndxng xts spectral representatxon. In general thxs leads 
to no effxcxencxes xn iraplementatxon unless N xs s\ibstantxally less than the 
extent of h. Trextel and Shanks suggest a method for truncatxng (D.26), es- 
sentxally keepxng only the domxnant terms, correspondxng to the largest exgen- 
values of h*h, and they perform an error analysxs for such approxxmate fxlters. 
Also havxng a decomposxtxon such as (D.26) suggests several interesting struc- 
tures. The summation can clearly be realized via a parallel arrangement of 
the various separable terms, and each separable term xs a cascade of two 1-D 
FIR filters — one operating vertically, the other horizontally. Thus, each 
of these can be implemented with an FFT, or, one might approximate each -1-D 
filter by a recursive filter which can be implemented even more efficiently. 

Thus we see that the separable and sum of separable cases can be handled
essentially with 1-D techniques. We will see later that such cases have
special implications in the state space framework.

Motivated by a similar desire to use 1-D design methods for 2-D problems,
Shanks, et al. [D-49] considered taking a 1-D continuous-time filter F(s),
which can be viewed as either a horizontal or vertical 2-D filter, and rotating
it by an angle β

$$F(s_1, s_2) = F(s_1\cos\beta + s_2\sin\beta) \tag{D.27}$$

thus obtaining a 1-D filter that processes data along lines at an angle β
with the $s_1$ axis. One can then apply the 2-D bilinear transformation to
obtain a 2-D digital filter design. Several examples of such "rotated
designs" are given in [D-49]. In addition, Costa and Venetsanopoulos [D-51]
have considered this design technique in more detail. They note that since
F is 1-D, it factors, and thus the stability test for the final 2-D filter
can be reduced to very simple tests on the factors. They find that for given
directions of recursion, there are constraints on the angle β for which the
resulting filter is stable. Of course, other angles are possible if one
rotates the data or changes the direction of recursion. In addition, they
consider the design of filters with circular symmetry, obtained by cascading
identical 1-D filters that have been rotated to be spaced evenly between 0°
and 360°. Such designs have the advantages of guaranteed stability, efficient
computer design, and cascade implementation due to the factorizability of the
1-D prototype filter.

The use of transformations to take 1-D designs into 2-D designs is a
conceptually appealing idea. In addition to the methods mentioned above and
the 1-D projections of Mersereau and Dudgeon [D-55] and Manry and Aggarwal
[D-56] discussed earlier, several other methods have been devised for utilizing
1-D filter designs. One of the most powerful methods of this type for
designing 2-D FIR filters involves the so-called McClellan transformations
[D-2,36,127,128,129]. The original algorithm as developed in [D-2,36] involves
transforming a 1-D filter of the form

$$G(e^{j\omega}) = \sum_{n=0}^{M} b(n)\cos n\omega \tag{D.28}$$
into a 2-D linear phase filter

$$H(e^{j\omega_1}, e^{j\omega_2}) = \exp\{-jM(\omega_1+\omega_2)\}\, \hat{H}(e^{j\omega_1}, e^{j\omega_2}) \tag{D.29}$$

where

$$\hat{H}(e^{j\omega_1}, e^{j\omega_2}) = \sum_{k=0}^{M}\sum_{p=0}^{M} a(k,p)\cos k\omega_1 \cos p\omega_2 \tag{D.30}$$

The specification of (D.30) is obtained from (D.28) by means of the
transformation

$$\cos\omega = A\cos\omega_1 + B\cos\omega_2 + C\cos\omega_1\cos\omega_2 + D \tag{D.31}$$

where the choices of A, B, C, D determine the shape of the contours on which
ω = constant. Clearly |H| is constant on such contours. For example, the choice
A = B = C = -D = 1/2 yields nearly circular contours, and hence one can map a
low pass filter G into a low pass circularly-symmetric filter H. Thus, one can
use 1-D FIR techniques to design 2-D FIR filters of high order in a reasonably
efficient manner. In some cases, one can in fact show that transformations of
1-D optimal filters (in the Chebyshev sense, i.e., minimizing the maximum
deviation from a desired frequency response) are in fact the optimal 2-D designs
[D-36]. In [D-128] an extension of this design criterion was considered, in
which (D.31) was replaced by

$$\cos\omega = \sum_{p=0}^{P}\sum_{q=0}^{Q} t(p,q)\cos p\omega_1 \cos q\omega_2 = H_p(e^{j\omega_1}, e^{j\omega_2}) \tag{D.32}$$


By careful choice of the parameters t(p,q), one can obtain a variety of
contour shapes in the 2-D frequency plane, and [D-128] contains details of
algorithms for choosing these parameters to obtain best approximations to
given desired contours. Having chosen the contours, the second part of the
design procedure involves the design of the 1-D FIR filter, for which there
are numerous procedures [D-2].

One of the nice features of these transformation designs is that they 
lead directly to efficient structures. The development of these structures 
and a study of their relative merits based on number of multiplies, coefficient 
sensitivity, and roundoff noise is given in [D-127,129]. We briefly illustrate 
the idea by following the development of Chan and McClellan in [D-127].
Examining (D.28), we note that

$$\cos n\omega = T_n[\cos\omega] \tag{D.33}$$

where $T_n$ is the nth Chebyshev polynomial, which satisfies the recursion

$$T_0(x) = 1, \quad T_1(x) = x, \quad T_n(x) = 2x\,T_{n-1}(x) - T_{n-2}(x) \tag{D.34}$$
Rewriting (D.28) as

$$G(e^{j\omega}) = \sum_{n=0}^{M} b(n)\, T_n[\cos\omega] \tag{D.35}$$

and replacing cos ω by (D.32), we can directly obtain a realization of
$\hat{H}(e^{j\omega_1}, e^{j\omega_2})$ as an interconnection of M copies of
$H_p(e^{j\omega_1}, e^{j\omega_2})$, where one uses the recursion (D.34) in
interconnecting the copies of $H_p$ to obtain realizations of each of the
$T_n[H_p(e^{j\omega_1}, e^{j\omega_2})]$. We refer the reader to [D-127]
for details.
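
To make the construction concrete, the following Python sketch evaluates the
transformed frequency response by running the Chebyshev recursion (D.34) on the
first-order transformation (D.31) with A = B = C = -D = 1/2. The coefficients
b(n) and all function names are hypothetical, chosen only for illustration, and
the sketch evaluates the response numerically rather than building the filter
structures of [D-127]. It compares the recursion with direct evaluation of
(D.28) at ω = arccos of the transformed variable.

```python
import math


def mcclellan_map(w1, w2, A=0.5, B=0.5, C=0.5, D=-0.5):
    """First-order transformation (D.31): returns cos(w) at (w1, w2)."""
    return A * math.cos(w1) + B * math.cos(w2) + C * math.cos(w1) * math.cos(w2) + D


def response_via_chebyshev(b, w1, w2):
    """Evaluate sum_n b(n) T_n[x], x = mcclellan_map(w1, w2), using (D.34)."""
    x = mcclellan_map(w1, w2)
    t_prev, t_curr = 1.0, x              # T_0(x) and T_1(x)
    total = b[0] * t_prev
    for n in range(1, len(b)):
        total += b[n] * t_curr           # t_curr holds T_n(x) here
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return total


def response_direct(b, w1, w2):
    """Evaluate (D.28) at w = arccos(x), clamping x against rounding error."""
    x = max(-1.0, min(1.0, mcclellan_map(w1, w2)))
    w = math.acos(x)
    return sum(bn * math.cos(n * w) for n, bn in enumerate(b))


# Hypothetical 1-D prototype coefficients b(0..3).
b = [0.5, 0.3, 0.15, 0.05]
grid = [k * math.pi / 8 for k in range(9)]
max_err = max(
    abs(response_via_chebyshev(b, w1, w2) - response_direct(b, w1, w2))
    for w1 in grid
    for w2 in grid
)
```

The recursion never needs ω itself, only the transformed variable, which is
what allows each Chebyshev term to be realized by interconnecting copies of
the transformation filter.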

Another 2-D design method adapted from 1-D was proposed by Shanks, et al.
[D-49], who modified the time-domain design technique of Burrus and Parks
[B-114]. As in the 1-D case, given a desired impulse response h(m,n), we
want to find a rational transfer function $A(z_1,z_2)/B(z_1,z_2)$ that yields an
impulse response "close" to h. We first solve for the denominator B by a
method quite similar to that used in computing the least-squares inverse (and
which evidently will have the same stability problems as those mentioned
earlier). One can then solve for the numerator using the analog of the method
described in [B-114] and in Section B.

One of the most widely used FIR design methods in 1-D is the optimum
Chebyshev design method, where the Remez exchange algorithm leads to an
extremely efficient computer design technique [D-2]. Kamp and Thiran [D-74]
have extended this algorithm to 2-D, but not without a number of severe
complications. Firstly, the Haar condition does not hold in the 2-D case and this
can lead to degeneracies that can keep the algorithm from converging. Also,
unlike the ordered 1-D case in which one can show [D-2] that errors between
the optimal design and the desired response alternate between ± the maximum
error (the Chebyshev norm), in the 2-D case one has no such alternation
theorem. This makes the exchange algorithm far more complex, and this plus
several other factors make the algorithm extremely slow. Hence it is limited
to low order impulse responses. We refer the reader to [D-74] for details.

In the 1-D case, one can use the so-called differential correction
method to find optimal Chebyshev rational frequency responses [D-52], and this
method has been extended by Bednar [D-13] to the 2-D case. As pointed out in
[D-3], this method requires a great deal of computation time, and also the
algorithm produces as its output an optimal rational magnitude-squared frequency
response. Thus, to obtain the actual filter specification, one must perform
a spectral factorization, which, as we have seen, leads in general to an
infinite order numerator and denominator.

In addition to the design methods mentioned above, a number of other
methods have been proposed. These include windowing [D-133], frequency sampling
[D-134], transformations of $(z_1,z_2)$ to obtain new designs from old [D-149],
and the extension of wave digital filters [D-151] to 2-D, with all of the
pseudopassivity and stability properties of their 1-D counterparts. We refer
the reader to these references for details.

The issue of 2-D filter structures and of their effects on required storage,
number of multiplies, coefficient sensitivity, and roundoff noise has been
raised several times in this section and is clearly of great importance. The
issue is complicated significantly by the fact that one cannot factor general
2-D polynomials. This immediately rules out cascade and parallel realizations
unless one is dealing with one of the special classes of filters described
earlier. Mitra, et al. [D-26] show, however, that one can write down the
generalizations of 1-D direct form realizations for NE recursions. They also
comment, as we did earlier, on the dependence of storage requirements not only
on the order of the filter but also on the output array dimensions. In
addition, for several special classes of NE rational filters, they developed
structures based on continued fraction expansions. We refer the reader to
[D-26] for details.

As in the 1-D case, a critical question in the design of 2-D IIR filters
is the existence of limit cycles and the effect of roundoff noise on the filter
output. Maria and Fahmy [D-28,73] have considered the limit cycle problem for
first-order 2-D recursive filters, both singly [D-73] and in cascade [D-28]. The
results in [D-28] on the existence of horizontal, vertical, and noninteracting
diagonal limit cycles parallel the results of Jackson [A-20] quite closely, and
their method for bounding the magnitude of limit cycles is quite similar to the
1-D result of Sandberg and Kaiser [A-4], although the bounds become far more
complex as one looks at limit cycles on rows or columns other than the first ones.

Open questions involve the extension of this type of result to higher order
filters. In addition, an intriguing question is whether one can extend any of
the other techniques discussed in Section A. Do the passivity-Tsypkin-positive
real-frequency domain results of Claasen, et al. [A-15] and others extend to
the 2-D case? What about the Lyapunov techniques of Willson [A-2]? Of course
in this case one would need 2-D state space models and a 2-D Lyapunov theory.
The analysis of roundoff noise in 2-D filters can be carried out much as
for 1-D filters, and we refer the reader to the references for examples of this
type of analysis. Another open question concerns the extension of the Lyapunov
equation-state covariance noise analysis method described in Section C for 1-D
roundoff analysis. Again one would need a state space model in order to consider
this question. We will come back to this question in a moment.

Finally, we note that Chan [D-107] has proposed a unified state space
framework for the study of 1-D and 2-D structures. In Section C we discussed
the 1-D aspects of this approach, in which all structures can be viewed as
factorizations of the map that transforms the present state and next input into
the next state and output. In the 2-D case, one must process inputs sequentially
according to any order function that is compatible with the recursion precedence
relation (D.11). Also, as we have noted before, the resulting 1-D state space
realization is finite dimensional if and only if the data is defined on a domain
that is bounded in one direction. Using the scan order described earlier,
Chan develops a time-varying state realization. The time variations arise for
precisely the reason mentioned earlier: we must take account of the edge effects
as we finish scanning one line and begin scanning at the start of the next.
Chan develops a realization using the scan order for a general NE recursive filter.
He conjectures that this realization is minimal in the recursive case, but
shows that it is not in the FIR case. On the other hand, in the FIR case, we
have mentioned earlier that one can realize the 2-D filter with the scan order
and a time-invariant 1-D filter by padding the ends of each line with zeroes
(this is essentially what Mersereau and Dudgeon did in [D-55]). Chan shows that
he can do the same in his setting by finding a nonminimal (because of the padded
zeroes) time-invariant realization. This leads to an interesting tradeoff:
nonminimality of one realization versus the more complex control needed in
order to implement the time-varying minimal one. The utility of such 1-D state
space models and the additional degree of freedom one has in choosing the order
relation (and hence the state space, as Witsenhausen [D-93,94] pointed out)
make this an interesting area for further research.
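
The zero-padding idea for the FIR case can be illustrated with a short Python
sketch. It is a hedged illustration only, not Chan's realization nor the
procedure of [D-55]: the arrays and function names are hypothetical, and the
padding length (Q - 1 zeroes per line for a filter with Q columns) is an
assumption chosen so that no output sample mixes data from adjacent lines. The
sketch scans the padded array line by line, applies a single time-invariant 1-D
convolution, and compares the result with direct 2-D convolution.

```python
def conv1d(x, h):
    """Full 1-D convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            y[i + k] += xi * hk
    return y


def conv2d(x, h):
    """Full 2-D convolution, computed directly from the definition."""
    rows = len(x) + len(h) - 1
    cols = len(x[0]) + len(h[0]) - 1
    y = [[0.0] * cols for _ in range(rows)]
    for m, xrow in enumerate(x):
        for n, xv in enumerate(xrow):
            for k, hrow in enumerate(h):
                for l, hv in enumerate(hrow):
                    y[m + k][n + l] += xv * hv
    return y


def scan_order_filter(x, h):
    """2-D FIR filtering as one 1-D convolution over zero-padded scan lines."""
    R, C = len(x), len(x[0])
    P, Q = len(h), len(h[0])
    W = C + Q - 1                       # padded line length
    scanned = []
    for row in x:                       # scan order: one line after another
        scanned.extend(row)
        scanned.extend([0.0] * (Q - 1))
    taps = [0.0] * ((P - 1) * W + Q)    # 1-D filter: h(k, l) placed at lag k*W + l
    for k in range(P):
        for l in range(Q):
            taps[k * W + l] = h[k][l]
    out = conv1d(scanned, taps)
    return [out[m * W:(m + 1) * W] for m in range(R + P - 1)]


x = [[1.0, 2.0, 0.0, -1.0], [0.5, 3.0, 1.0, 2.0], [4.0, -2.0, 1.5, 0.0]]
h = [[1.0, 0.5, 0.25], [-1.0, 2.0, 0.0]]
direct = conv2d(x, h)
scanned_result = scan_order_filter(x, h)
max_diff = max(abs(a - b) for r1, r2 in zip(direct, scanned_result) for a, b in zip(r1, r2))
```

The 1-D filter in this sketch has (P - 1)·W + Q taps, most of them zero, which
is one way to see the nonminimality that the padding introduces.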

In addition to the above 1-D state space descriptions for recursively
ordered 2-D systems, some work has been done in the past few years involving
the definition and analysis of 2-D state space models. Roesser [D-110]
considers NE models of the form

$$\begin{aligned}
v(i+1,j) &= A_1 v(i,j) + A_2 h(i,j) + B_1 x(i,j) \\
h(i,j+1) &= A_3 v(i,j) + A_4 h(i,j) + B_2 x(i,j) \\
y(i,j) &= C_1 v(i,j) + C_2 h(i,j) + D x(i,j)
\end{aligned} \tag{D.36}$$

Here x is the input, y is the output, and v and h together play the role of a
"state" variable. Here v carries information vertically, and h conveys it
horizontally. In addition, Roesser takes (D.36) to be a NE recursion (i, j ≥ 0).
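
As a minimal sketch of how (D.36) is run as a NE recursion, the following
Python fragment simulates the model with scalar v and h. The scalar state
dimensions, the numerical coefficients, the zero boundary conditions, and the
function name are all assumptions made for illustration; they do not come
from [D-110].

```python
def roesser_response(A1, A2, A3, A4, B1, B2, C1, C2, D, x):
    """Run the scalar-state Roesser recursion (D.36) over the input array x.

    v(0, j) and h(i, 0) are the boundary conditions, taken to be zero here.
    """
    rows, cols = len(x), len(x[0])
    v = [[0.0] * cols for _ in range(rows + 1)]        # v[i][j]
    h = [[0.0] * (cols + 1) for _ in range(rows)]      # h[i][j]
    y = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            y[i][j] = C1 * v[i][j] + C2 * h[i][j] + D * x[i][j]
            v[i + 1][j] = A1 * v[i][j] + A2 * h[i][j] + B1 * x[i][j]
            h[i][j + 1] = A3 * v[i][j] + A4 * h[i][j] + B2 * x[i][j]
    return y


# Unit impulse at (0, 0) on a 4 x 4 array; hypothetical coefficients.
impulse = [[1.0 if (i, j) == (0, 0) else 0.0 for j in range(4)] for i in range(4)]
y = roesser_response(0.5, 0.2, -0.3, 0.4, 1.0, 2.0, 1.0, -1.0, 0.25, impulse)
```

Each point needs only v and h from the points immediately to the west and south
of it, which is the sense in which the pair (v, h) acts as a state.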

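Given this model, Roesser considers several issues. He solves (D.36),
and the solution resembles the variation of constants formula for usual
finite-dimensional 1-D linear systems. The one main difference is that boundary
conditions v(0,j), j ≥ 0 and h(i,0), i ≥ 0 must be specified. Roesser also
considers a 2-D version of the Cayley-Hamilton Theorem. Taking the 2-D transform
of (D.36) (with zero boundary conditions), we obtain

$$Y(z_1,z_2) = \left\{ \begin{bmatrix} C_1 & C_2 \end{bmatrix} \begin{bmatrix} z_1 I - A_1 & -A_2 \\ -A_3 & z_2 I - A_4 \end{bmatrix}^{-1} \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} + D \right\} X(z_1,z_2) \tag{D.37}$$

and hence in this setting the role of the characteristic polynomial is played by

$$p(z_1,z_2) = \det \begin{bmatrix} z_1 I - A_1 & -A_2 \\ -A_3 & z_2 I - A_4 \end{bmatrix} \tag{D.38}$$

Let

$$A^{1,0} = \begin{bmatrix} A_1 & A_2 \\ 0 & 0 \end{bmatrix}, \qquad A^{0,1} = \begin{bmatrix} 0 & 0 \\ A_3 & A_4 \end{bmatrix} \tag{D.39}$$

represent the required dynamics to advance the system in the vertical and
horizontal directions, respectively. We can then define the transition matrix
over a number of vertical and horizontal steps

$$A^{0,0} = I, \qquad A^{i,j} = A^{1,0}A^{i-1,j} + A^{0,1}A^{i,j-1} \tag{D.40}$$

(with $A^{i,j} = 0$ whenever i or j is negative). Then, if we define

$$E^iF^jA = F^jE^iA = A^{i,j} \tag{D.41}$$

we have the 2-D Cayley-Hamilton theorem

$$p(E,F)A = 0 \tag{D.42}$$

that is, replacing each monomial $z_1^i z_2^j$ of p by $A^{i,j}$ yields the
zero matrix.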

Roesser uses this result to obtain an efficient method for computing the
transition matrix. The result is also used to obtain finite rank tests as
in the 1-D case for controllability and observability, which are defined in
analogy with 1-D. Specifically, a state (v,h) is observable if whenever it
appears as the initial state at (0,0), with all other boundary conditions
zero, the resulting output y(i,j), i,j ≥ 0 is not identically zero when all
zero inputs are applied. The state is controllable if there is some
(i,j) > (0,0) and set of inputs so that (v(i,j),h(i,j)) = (v,h) when the boundary
conditions are all zero.
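
The transition matrix recursion and the 2-D Cayley-Hamilton theorem can be
checked numerically in the simplest setting. The sketch below assumes scalar
blocks $A_1, \ldots, A_4$ (so that the transition matrices are 2 x 2), uses
hypothetical numerical values, and takes $A^{i,j} = 0$ for negative indices; the
function names are illustrative and this is not Roesser's algorithm for
computing the transition matrix. For scalar blocks the characteristic polynomial
is $p(z_1,z_2) = z_1 z_2 - A_4 z_1 - A_1 z_2 + (A_1A_4 - A_2A_3)$.

```python
from functools import lru_cache


def matmul(X, Y):
    """Product of two 2 x 2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def combine(terms):
    """Linear combination sum_k c_k * M_k of 2 x 2 matrices."""
    return tuple(
        tuple(sum(c * M[i][j] for c, M in terms) for j in range(2))
        for i in range(2)
    )


a1, a2, a3, a4 = 0.5, 0.2, -0.3, 0.4          # hypothetical scalar blocks
A10 = ((a1, a2), (0.0, 0.0))                  # advances one vertical step
A01 = ((0.0, 0.0), (a3, a4))                  # advances one horizontal step
IDENTITY = ((1.0, 0.0), (0.0, 1.0))
ZERO = ((0.0, 0.0), (0.0, 0.0))


@lru_cache(maxsize=None)
def transition(i, j):
    """A^{i,j} = A^{1,0} A^{i-1,j} + A^{0,1} A^{i,j-1}, with A^{0,0} = I."""
    if i < 0 or j < 0:
        return ZERO
    if i == 0 and j == 0:
        return IDENTITY
    return combine([
        (1.0, matmul(A10, transition(i - 1, j))),
        (1.0, matmul(A01, transition(i, j - 1))),
    ])


def cayley_hamilton_residual(i, j):
    """p evaluated with z1^k z2^l replaced by A^{i+k, j+l}."""
    return combine([
        (1.0, transition(i + 1, j + 1)),
        (-a4, transition(i + 1, j)),
        (-a1, transition(i, j + 1)),
        (a1 * a4 - a2 * a3, transition(i, j)),
    ])


residual = cayley_hamilton_residual(0, 0)
max_residual = max(abs(v) for row in residual for v in row)
```

The residual at (0, 0) is the statement of the theorem itself; the same
combination evaluated one vertical step later, at (1, 0), also vanishes for
these values.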

Several questions and issues arise in considering Roesser's model. First
of all, not all NE quadrant rational transfer functions can be realized by
systems of the form (D.36), although this can be remedied by a modification of
the output equations [D-164]. We refer the reader to [D-164] for more on
realization theory and canonical forms for these systems. Also, in obtaining
his algorithm for recursively computing the $A^{i,j}$ via the Cayley-Hamilton
theorem, Roesser used the notion of 2-D eigenvalues in a crucial manner, and in
the usual non-factorizable case the calculation of zeroes of $p(z_1,z_2)$ is
extremely difficult. This not only complicates his transition matrix algorithm,
but it makes stability tests more difficult. One must use methods such as
Siljak's [D-27] on $p(z_1,z_2)$ or a direct extension of Huang's stability test
to the model (D.36) (see [D-164]). An interesting open question is the
development of Lyapunov stability methods for (D.36). Furthermore, the model
(D.36) is limited to quadrant-causal systems. This is perfectly reasonable for
the study of quadrant-recursive filters, but its value for the analysis of other
2-D signals is unclear. For example, Roesser mentions the possibility of a 2-D
filtering theory based on (D.36). In this case, one would want to model the
observed signal z as

$$z(i,j) = y(i,j) + N(i,j) \tag{D.43}$$

where N is noise, and y is generated by a model as in (D.36) with x a noise
process. Thus (D.36) plays the role of a "spatial shaping filter." As
Ekstrom and Woods [D-119] point out, one cannot obtain arbitrary spectra from
a NE shaping filter. Hence, one may need two such filters, as well as a
method for modelling the spectra of the signal field. Also, the artificially
imposed causality of the model (D.36), and in fact of any state space model,
may cause difficulties. For example, in an image one would not expect light
intensity as a function of spatial location to have a NE causal structure.
On the other hand, if a NE causal filter yields the proper shape for the
intensity correlation function, there may be no difficulty in using such a model.
Indeed, as Andrews and Hunt [D-81] point out, the use of such models may be
of value in leading to efficient recursive filtering methods for image processing.
This remains an open area for further research, and we will have more to say
about it in the next subsection.

Finally, we note that Roesser's "state" (v(i,j),h(i,j)) might better be
termed a "local state" [D-97,138]. As we saw earlier, in recursively solving
2-D equations, the required amount of storage in general depends on the size of
the arrays of interest (see Figures D.3 and D.6). Hence if the array sizes are
unbounded, the required memory is infinite. Thus, v and h in Roesser's model
do not represent the true state. Rather the model (D.36) can be viewed as
arising by reducing a scalar, high order 2-D difference equation to a vector,
first-order equation. In this way, we see that the dimensions of v and h
correspond to the order of the equations of interest.

Issues of this type have been considered in more depth by Fornasini and
Marchesini [D-97,138]. They consider impulse responses that lie strictly in
the NE quadrant, and for such systems they define a notion of "global state"
using a direct generalization of the theory of Nerode. In order to define the
global state as containing all relevant information concerning "past" inputs,
one needs to define "past." The definition of past inputs at the point (i,j)
is all x(k,l) where either k < i or l < j (see Figure D.8). In this way the state
must summarize all needed boundary conditions, and Fornasini and Marchesini
point out that the state is usually infinite dimensional.

Attention in [D-97] then shifts to local NE state space descriptions of
the form

$$\begin{aligned}
x(m+1,n+1) &= A_0 x(m,n) + A_1 x(m+1,n) + A_2 x(m,n+1) + B u(m,n) \\
y(m,n) &= C x(m,n)
\end{aligned} \tag{D.44}$$

Note here that vertical and horizontal information is conveyed by a single
state vector. Having this model, it is then shown that a NE IIR filter can
be realized as in (D.44) if and only if the transform of the impulse response
is rational. The "if" part of this result involves a procedure for
constructing a realization in a form that is some type of generalization of the 1-D
"standard controllable form".

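A minimal Python sketch of the local state recursion (D.44) follows. It assumes
a scalar state, zero boundary values x(m,0) = x(0,n) = 0, and hypothetical
coefficient values; the function name is illustrative and the code is not taken
from [D-97]. It computes the impulse response so that the first few samples can
be compared with the values obtained by unrolling (D.44) by hand.

```python
def fm_impulse_response(A0, A1, A2, B, C, size):
    """Impulse response of the scalar-state model (D.44) on a (size+1)^2 grid.

    Boundary values x(m, 0) and x(0, n) are assumed to be zero.
    """
    x = [[0.0] * (size + 1) for _ in range(size + 1)]
    for m in range(size):
        for n in range(size):
            u = 1.0 if (m, n) == (0, 0) else 0.0       # unit impulse input
            x[m + 1][n + 1] = (
                A0 * x[m][n] + A1 * x[m + 1][n] + A2 * x[m][n + 1] + B * u
            )
    return [[C * value for value in row] for row in x]


# Hypothetical coefficients.
A0, A1, A2, B, C = 0.1, 0.5, 0.3, 1.0, 2.0
h = fm_impulse_response(A0, A1, A2, B, C, 4)
```

Unrolling the recursion by hand gives h(1,1) = CB, h(1,2) = C A_1 B,
h(2,1) = C A_2 B, and h(2,2) = C(A_0 + 2 A_1 A_2)B for a scalar state, which is
what the sketch reproduces.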
Having such realizations, attention naturally focuses on minimality, that is,
obtaining a local state space model (D.44) with as small a state space as
possible. This leads directly to the notions of controllability and
observability, with finite rank conditions for these properties being developed in a
manner analogous to that of Roesser. In fact, a simple proof of the 2-D
Cayley-Hamilton result is given in [D-97] for systems as in (D.44). The main
minimality result of Fornasini and Marchesini is that minimality implies local
controllability and observability (an algorithm for reducing the dimension of
uncontrollable and/or unobservable realizations is given) but that local
controllability and observability do not imply minimality. This is shown by
means of a counterexample that we will discuss shortly.

It should be noted that the work in [D-97] is phrased in terms of the
algebraic notion of "formal power series" (essentially (D.3) with no convergence
properties attached to it). The most thorough treatments of the uses of this
theory to study topics in formal language theory, automata theory, nonlinear
systems analysis, and 2-D processes are the works of Fliess [D-98,139,140].
Fliess studies the properties of rational power series¹¹ in detail using in
part a generalization of the Hankel matrix, and he shows that the rank of this
matrix equals the dimension of the minimal global state space. This is infinite
dimensional, in general, but Fliess notes [D-98] that the global state space is
finite dimensional if and only if the formal power series is "recognizable",
which simply means that it has a separable denominator. As we have seen, one
can do a great deal of analysis for separable 2-D systems, since many 1-D
concepts and results directly extend in this case.

Attasi [D-6,35,96] has studied such systems in great detail. His basic
model is a special case of (D.44)

$$\begin{aligned}
x(m+1,n+1) &= F_1 x(m,n+1) + F_2 x(m+1,n) - F_1F_2 x(m,n) + G u(m,n) \\
y(m,n) &= H x(m,n)
\end{aligned} \tag{D.45}$$

where it is assumed that

$$F_1F_2 = F_2F_1 \tag{D.46}$$

With these assumptions, one finds that the impulse response is strictly NE,
and it and its transform are given by

$$h(i,j) = H F_1^{i-1} F_2^{j-1} G, \qquad i,j > 0 \tag{D.47}$$

$$H(z_1,z_2) = H (z_1 I - F_1)^{-1} (z_2 I - F_2)^{-1} G \tag{D.48}$$

¹¹ In general the indeterminates in this theory are taken to be noncommuting.
However in the 2-D case, the two shifts $z_1$ and $z_2$ do commute.
Clearly any FIR filter can be realized as in (D.45), and thus any stable
impulse response can be approximated arbitrarily closely by a system of this 
form. This, of course, is neither startling nor necessarily very useful, since 
the dimension of the resulting state-space system may be extremely large. 
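
The closed form (D.47) can be checked against a direct simulation of (D.45).
The Python sketch below is an illustration under stated assumptions: the 2 x 2
matrices are hypothetical, $F_2$ is built as a polynomial in $F_1$ so that the
commutativity condition (D.46) holds, the boundary values of x are taken to be
zero, and the helper names are not from Attasi's work.

```python
def matmul(X, Y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def matadd(*Ms):
    """Elementwise sum of several matrices of the same shape."""
    return [[sum(M[i][j] for M in Ms) for j in range(len(Ms[0][0]))]
            for i in range(len(Ms[0]))]


def scale(c, X):
    return [[c * v for v in row] for row in X]


def matpow(X, n):
    R = [[float(i == j) for j in range(len(X))] for i in range(len(X))]
    for _ in range(n):
        R = matmul(R, X)
    return R


# Hypothetical commuting pair: F2 = 0.2 I + 0.5 F1, so F1 F2 = F2 F1 (D.46).
F1 = [[0.5, 0.1], [0.0, 0.3]]
F2 = matadd(scale(0.2, [[1.0, 0.0], [0.0, 1.0]]), scale(0.5, F1))
G = [[1.0], [1.0]]
H = [[1.0, -2.0]]


def attasi_impulse_response(size):
    """Simulate (D.45) with zero boundary values and a unit impulse at (0, 0)."""
    zero = [[0.0], [0.0]]
    x = [[zero for _ in range(size + 1)] for _ in range(size + 1)]
    F1F2 = matmul(F1, F2)
    for m in range(size):
        for n in range(size):
            u = 1.0 if (m, n) == (0, 0) else 0.0
            x[m + 1][n + 1] = matadd(
                matmul(F1, x[m][n + 1]),
                matmul(F2, x[m + 1][n]),
                scale(-1.0, matmul(F1F2, x[m][n])),
                scale(u, G),
            )
    return [[matmul(H, x[m][n])[0][0] for n in range(size + 1)] for m in range(size + 1)]


def closed_form(i, j):
    """h(i, j) = H F1^(i-1) F2^(j-1) G for i, j > 0, as in (D.47)."""
    return matmul(H, matmul(matpow(F1, i - 1), matmul(matpow(F2, j - 1), G)))[0][0]


h_sim = attasi_impulse_response(4)
max_diff = max(abs(h_sim[i][j] - closed_form(i, j)) for i in range(1, 5) for j in range(1, 5))
```

The cancellation that makes the response separable in its dependence on i and j
comes from the $-F_1F_2$ term together with (D.46).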

Having this framework, Attasi defines dual notions of local controllability
and observability and derives conditions somewhat simpler than in [D-97,110]
because of the special nature of (D.45). Attasi also considers minimal
realizations of the form of (D.45), obtains a state space decomposition result and
minimal realization algorithm much like those in 1-D (here the 2-D Hankel matrix
plays a crucial role), and shows that minimality implies controllability and
observability. He also proves the converse of this last result, but this is
only true if one looks for the minimal realization in the class of models given
by (D.45). Consider the example constructed by Fornasini and Marchesini [D-97]




-1 - 1 ,, -1 - 1 , 


Zl Z^ (1+z^ ) 

(l+z“^) (l+z^^ ) 


(D.49) 


The minimal realization of the form of (D.45) is of dimension >3, but one can
find a realization of the form (D.44) of dimension 2. This clearly points out
another of the many complications that arise in going from 1-D to 2-D.

Undoubtedly the major contribution of Attasi's work is that he did something
with his models¹². He was able to develop a 2-D Lyapunov equation. More
specifically, to show northern and eastern asymptotic stability, we simply need
to check the 1-D systems along vertical or horizontal lines. This leads to 1-D
Lyapunov equations and nothing new. However, Attasi did obtain an "invariance
principle" type of result (see Section A): if $F_1$ and $F_2$ are stable, then
(D.45) is controllable if and only if the equation

$$P - F_1 P F_1' - F_2 P F_2' + F_1F_2 P F_2'F_1' = GG' \tag{D.50}$$

has a unique positive definite solution P. The exact implication of this
result for 2-D stability theory and its potential utility in such areas as
limit cycle analysis are at present unclear and remain intriguing questions
for further work.

¹² That may very well be because this is the one case in which one can readily
see what to do.
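
One way to see where (D.50) comes from is to note that, when $F_1$ and $F_2$
commute and are stable, it is solved by nesting two ordinary 1-D Lyapunov
(Stein) equations: first $Q - F_2 Q F_2' = GG'$, then $P - F_1 P F_1' = Q$.
The Python sketch below checks this numerically. It is an illustration only:
the matrices are hypothetical, the fixed-point iteration is a simple stand-in
for a proper Lyapunov solver, and the nesting argument is a reconstruction
rather than Attasi's derivation.

```python
def matmul(X, Y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def lincomb(terms):
    """Linear combination sum_k c_k * M_k of equally sized matrices."""
    first = terms[0][1]
    return [[sum(c * M[i][j] for c, M in terms) for j in range(len(first[0]))]
            for i in range(len(first))]


def stein(F, Q, iters=200):
    """Fixed-point iteration for P = Q + F P F' (converges when F is stable)."""
    P = Q
    for _ in range(iters):
        P = lincomb([(1.0, Q), (1.0, matmul(matmul(F, P), transpose(F)))])
    return P


# Hypothetical stable, commuting pair: F2 = 0.2 I + 0.5 F1.
F1 = [[0.5, 0.1], [0.0, 0.3]]
F2 = [[0.45, 0.05], [0.0, 0.35]]
G = [[1.0], [1.0]]
GGt = matmul(G, transpose(G))

Q = stein(F2, GGt)          # Q - F2 Q F2' = GG'
P = stein(F1, Q)            # P - F1 P F1' = Q

F12 = matmul(F1, F2)
residual = lincomb([
    (1.0, P),
    (-1.0, matmul(matmul(F1, P), transpose(F1))),
    (-1.0, matmul(matmul(F2, P), transpose(F2))),
    (1.0, matmul(matmul(F12, P), transpose(F12))),
    (-1.0, GGt),
])
max_residual = max(abs(v) for row in residual for v in row)
determinant = P[0][0] * P[1][1] - P[0][1] * P[1][0]
```

For this example P is symmetric with positive leading entry and positive
determinant, hence positive definite.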

Attasi also considers systems as in (D.45) which are driven by white
noise. Again he obtains a 2-D Lyapunov equation for the state covariance,
and this result may be of some value in performing roundoff noise analysis
for 2-D filters (see the analogous 1-D discussion in Section C). Also,
Attasi shows that any 2-D stationary covariance function can be approximated
arbitrarily closely by a system of this type, and he develops a stochastic
realization theory that exactly parallels the 1-D case with one rather
surprising exception. In the 1-D case, there is in general a whole family of
stochastic realizations, each of which essentially factors the spectral density
S(z) of the output process y. In the 2-D case, assuming that one can factor
the spectrum of y, the stochastic realization is essentially unique.
This is due primarily to the additional constraints on S imposed by the fact
that we use a single quadrant shaping filter (D.45). Specifically, in addition
to the constraints imposed by NE and SW correlations, an additional constraint
arises in considering NW and SE correlations. This constraint leads to the
uniqueness result.

We note that this stochastic realization/spectral factorization result
suffers from all of the numerical problems mentioned in Section B and from
the difficulties of 2-D factorization. The one novel feature of Attasi's
development is the use, and in fact the necessity, of non-square factors; i.e.,
to perform the required factorization

$$S(z_1,z_2) = H(z_1,z_2)\, H'(z_1^{-1}, z_2^{-1}) \tag{D.51}$$

where H is NE causal and of the form (D.48), one must consider rectangular
factors¹³. For example, if y is a scalar process, then H in general must be
1×m, and, in fact, the aforementioned uniqueness result fixes the value of m.
We refer the reader to [D-6,35,96].

We remark that the primary motivation for Attasi's work was to develop a
2-D stochastic framework in which to study 2-D Kalman filtering and its
application to image processing. Several other authors have considered such
problems, and we will consider them in the next subsection.

Recently, Morf, et al. [D-162,163] have made several noteworthy
contributions to 2-D state space theory. In [D-162] they consider the properties of
polynomial and rational matrices in two variables. The motivation for this
study, which leads naturally to multi-input, multi-output 2-D systems, is the
generalization of the scalar 2-D polynomial results of Bose [D-147,166] and the
matrix 1-D polynomial results of Rosenbrock [D-168] and Wolovich [D-169].
Morf, et al. generalize the scalar notion of primitive factorization to the
matrix polynomial case, and they provide an existence and uniqueness proof
for such a factorization. By regarding a 2-D polynomial $p(z_1,z_2)$ as a 1-D
polynomial (say in $z_1$) with coefficients that are rational functions in the
other variable and by introducing several notions from algebraic geometry,
they are able to use many 1-D techniques to obtain 2-D generalizations of the
Euclidean algorithm, Hermite and Smith forms, tests for relative coprimeness
of polynomial matrices, matrix fraction descriptions of rational matrices, and
the extraction of greatest common right divisors. In 1-D Rosenbrock and
Wolovich utilize many of these properties to study multi-input, multi-output
state space models. In [D-163] the results of [D-162] are used to study
2-D state space models. The models of Roesser, Fornasini-Marchesini, and
Attasi are reviewed, and Morf, et al. argue in favor of Roesser's model.
Their reasoning is that (D.36) is a true first order system, and hence v and h
together comprise a valid local state. The model (D.44), on the other hand, is
not first order, and hence x is not a local state; i.e., the order of the
system (D.44) may be larger than the dimension of x. The importance of this
is not totally clear, since, as we've seen, the required storage depends on
more than the order of the system.

¹³ Rectangular factors are considered in the general 1-D stochastic realization
theory described in Section B, but they are not necessary in order to factor 1-D
scalar spectra.

The concepts of local controllability and observability for the Roesser
model are explored in [D-163], and the authors point out that these conditions
neither imply nor are implied by the minimality of the realization (this is
done with several instructive examples). This difficulty can be partially
overcome by redefining local controllability and observability for (D.36) by
requiring these properties to hold separately in the horizontal and vertical
directions (but not necessarily jointly). With this definition, minimality
implies but is not implied by local controllability and observability.

To obtain notions of controllability and observability that are equivalent
to minimality, Morf, et al. generalize the approach of Rosenbrock in which
coprimeness of polynomial matrices plays a crucial role. This leads to the
notions of modal controllability and observability and a related concept of
minimality and also allows one to use the algebraic and geometric concepts
developed in [D-162] in order to study the 2-D realization problem. In this
setting the existence of minimal realizations becomes a difficult problem,
and one may not even exist if we restrict ourselves to systems with real
parameters (see [D-163] for an example). In related work, Sontag [D-143,154,E-29]
has also found realizations of lower dimension than those proposed by
Fornasini and Marchesini, and he has shown that minimal realizations need not
be unique up to a change of basis. All of these facts indicate that the 2-D
state space model is an extremely complex one and offers some extremely difficult
mathematical and conceptual problems. As with all other topics concerning 2-D
systems, there are many possible ways to generalize 1-D concepts. It remains
to be seen whether any of these state models and realization theories can
provide a useful framework for solving 2-D analysis and synthesis problems.

A number of authors have considered state space and other dynamic models
defined with very general independent variables. Motivated to a large degree
by the partially-ordered feedback structures of Ho and Chu [D-87,88] and
Witsenhausen [D-93,94], Mullans and Elliott [D-95] and Wyman [D-143,160,161]
have considered the development of an algebraic state space theory on partially
ordered sets. In addition, Seviora and Sablatash [D-114-116] have placed
algebraic (specifically, abelian group) structures on the independent variable
in order to consider a generalized transform and digital filter theory with
the aid of tools from the theory of abstract harmonic analysis. Their framework
is quite abstract and general, and it includes such possible time sets as the
integers, the usual 2-D plane of integer pairs, and a variety of "cylindrical
time sets." We will have occasion to use such a "time set" in the next section.

Finally, we have noted at several points that the issues arising in the
analysis of 2-D discrete time systems have many similarities with results in
other areas. For example, Ansell [D-64] and Youla [D-77] studied
continuous-time transfer functions in two variables that arise in the consideration of
networks containing lumped and distributed elements. Along similar lines,
Kamen [E-30] has developed an algebraic theory for considering continuous-time
systems that contain time delays. In addition, as mentioned earlier, Sontag
[D-143,154,E-29] has considered a general algebraic framework of this type and
has tied together some of the time delay and 2-D results.

Other classes of systems have also been analyzed in a similar manner.
Kamen [D-142] has developed a theory for time-varying 1-D systems that bears
some resemblance to the 2-D theory. Also, Fliess [D-98,139,140], Fornasini
and Marchesini [D-138,E-36], and Bush [D-155] have noted and have taken
advantage of some of the rather striking relationships among certain nonlinear
and 2-D system results. To illustrate the basic idea, consider the following
three systems:

Volterra (single input)

    y(m) = \sum_{k,\ell} h(m-k, m-\ell) x(k) x(\ell)                  (D.52)

Bilinear (two inputs)

    y(m) = \sum_{k,\ell} h(m-k, m-\ell) x_1(k) x_2(\ell)              (D.53)

Two Dimensional (single input)

    y(m,n) = \sum_{k,\ell} h(m-k, n-\ell) x(k,\ell)                   (D.54)
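The relationship among these three forms can be checked numerically. The sketch below is ours and is not taken from the cited references; the function names, the finite supports, and the random test data are illustrative assumptions. It verifies that the Volterra output (D.52) equals the output of the 2-D system (D.54), driven by the product input x(k)x(\ell), evaluated along the diagonal m = n.

```python
import numpy as np

def volterra_direct(h, x):
    """y(m) = sum_{k,l} h(m-k, m-l) x(k) x(l), as in (D.52), for finite supports."""
    K, n = h.shape[0], len(x)
    y = np.zeros(n + K - 1)
    for m in range(n + K - 1):
        for k in range(n):
            for l in range(n):
                a, b = m - k, m - l
                if 0 <= a < K and 0 <= b < K:
                    y[m] += h[a, b] * x[k] * x[l]
    return y

def conv2d_full(h, u):
    """y(m,n) = sum_{k,l} h(m-k, n-l) u(k,l), as in (D.54), for finite supports."""
    out = np.zeros((h.shape[0] + u.shape[0] - 1, h.shape[1] + u.shape[1] - 1))
    for k in range(u.shape[0]):
        for l in range(u.shape[1]):
            out[k:k + h.shape[0], l:l + h.shape[1]] += u[k, l] * h
    return out

rng = np.random.default_rng(0)
h = rng.standard_normal((3, 3))   # hypothetical K x K weighting function
x = rng.standard_normal(5)        # hypothetical input sequence

y_volterra = volterra_direct(h, x)
y_2d = conv2d_full(h, np.outer(x, x))   # 2-D system driven by x(k) x(l)
y_diag = np.diag(y_2d)                  # restrict the 2-D output to m = n
```

The bilinear case (D.53) follows in the same way with np.outer(x1, x2) as the 2-D input.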


One immediately sees the striking relationship among these three classes of
systems, and it is not surprising that similar methods of analysis can be
used on all of them. Indeed, Fliess' formal power series formulation leads
directly to a methodology for analyzing algebraic properties of each kind of
system. Also, Fornasini and Marchesini were led to the study of 2-D systems
by their earlier results on bilinear systems. Finally, we note that in his
work on bilinear systems Bush considered the 2-D transform H(z_1,z_2) of the
weighting function h that appears in (D.53). He showed that if one could
write

    H(z_1,z_2) = \frac{p(z_1,z_2)}{q_1(z_1) q_2(z_2) q_3(z_1 z_2)}       (D.55)

where p is a two-variable polynomial and the q_i are polynomials in a single
variable, then the system could be realized by three finite dimensional
linear systems and a single multiplier. Again the lack of a two-variable
counterpart of the fundamental theorem of algebra makes it difficult to find
representations as in (D.55) (a condition slightly weaker than separability).
We refer the reader to the references for details of these ideas.

In this subsection we have surveyed a large number of issues involving
systems over a 2-D parameter space. We have seen that a number of 1-D concepts
can be extended to the 2-D case (e.g., 2-D FIR implementation schemes
using the FFT), while others cannot (e.g., cascade structures). In many cases
there are several possible extensions from 1-D to 2-D (as with the several
notions of causality and the variety of directions of recursion), and in most
situations the 2-D counterparts of 1-D results are far more complex (as with
the 2-D stability tests). We have mentioned several of the reasons for
difficulties in 2-D: difficulties in defining notions of causality, recursibility,
and "state" (local or global) in 2-D, the absence of a 2-D factorization theorem,
and the absence of the Haar condition. Also we have speculated on a wide range
of open problems in such areas as filter design, filter structures, the
accompanying issues of storage, sensitivity, and roundoff effects, and the
development of useful state space models and tools such as the 2-D Lyapunov
equation. In the next subsection we will open up several additional issues
involving 2-D random processes.

D.2 Image Processing, Random Fields, and Space-Time Systems 

Digital processing of images for data compression, noise removal, or
enhancement is one of the major areas of applications of 2-D digital signal
processing techniques. In addition, image processing has spurred a great deal
of work in the analysis of spatially-distributed stochastic variables, that is,
random fields. In this subsection we will discuss some of the work concerning
image processing and random fields and will point out what we consider to
be several particularly intriguing areas for further work. The reader who
is interested in obtaining a detailed understanding of image formation and
processing and of the response of the human visual system should consult the
references. In particular, we refer the reader to the survey paper of Hunt
[D-4], the book written by Andrews and Hunt [D-81], and the paper of
Stockham [D-82]. We will refer to these references often as we sketch some
of the issues involved in image processing.

Let g(x,y) denote the image radiant energy as a function of two spatial
variables, where, for the time being, we will assume that the system is free
of noise. The image results from an image formation process that transforms
the original object radiant energy f(x,y) into the observed image. A general
model that is often used for the image formation process is

    g(x,y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} h(x,y,x_1,y_1,f(x_1,y_1)) \, dx_1 \, dy_1        (D.56)

Although in some cases the formation process may be nonlinear (see [D-81] for
examples), in many cases it is valid to assume a linear model

    g(x,y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} h(x,y,x_1,y_1) f(x_1,y_1) \, dx_1 \, dy_1        (D.57)

Here h(x,y,x_1,y_1) is called the point-spread function (PSF), as it represents
the image that results from a point source located at (x_1,y_1)
(i.e. f(x,y) = \delta(x-x_1) \delta(y-y_1)).
This function models the smoothing and blur that take place in the image
formation process. Sources of such blur abound. See [D-4,19,24,65,81] for
detailed discussions of some of these. Examples include blur due to motion,
defocused systems, and the effects of atmospheric turbulence.

The model (D.57) represents a spatially-varying 2-D linear system. In
some cases, one can take advantage of simplifying assumptions, such as
shift-invariance

    h(x,y,x_1,y_1) = h(x-x_1, y-y_1)                  (D.58)

separability

    h(x,y,x_1,y_1) = h_1(x,x_1) h_2(y,y_1)            (D.59)

or both

    h(x,y,x_1,y_1) = h_1(x-x_1) h_2(y-y_1)            (D.60)

As one might expect, these simplifications lead to gains in analytical
tractability and computational efficiency.
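The computational gain from the combined assumption (D.60) is easy to see in a discrete setting. The following is a minimal sketch of ours, assuming sampled data of finite extent and two hypothetical 1-D blur kernels h1 and h2; it checks that a full 2-D convolution with the product kernel equals one 1-D pass along each direction.

```python
import numpy as np

def conv2d_direct(h, f):
    """Full 2-D convolution g(i,j) = sum_{k,l} h(i-k, j-l) f(k,l)."""
    g = np.zeros((h.shape[0] + f.shape[0] - 1, h.shape[1] + f.shape[1] - 1))
    for k in range(f.shape[0]):
        for l in range(f.shape[1]):
            g[k:k + h.shape[0], l:l + h.shape[1]] += f[k, l] * h
    return g

def conv2d_separable(h1, h2, f):
    """Same result when h(i,j) = h1(i) h2(j): one 1-D pass per direction."""
    tmp = np.apply_along_axis(lambda col: np.convolve(col, h1), 0, f)
    return np.apply_along_axis(lambda row: np.convolve(row, h2), 1, tmp)

rng = np.random.default_rng(0)
h1 = np.array([0.25, 0.5, 0.25])   # hypothetical blur in the first variable
h2 = np.array([0.2, 0.6, 0.2])     # hypothetical blur in the second variable
f = rng.random((6, 7))             # hypothetical object array

g_full = conv2d_direct(np.outer(h1, h2), f)
g_sep = conv2d_separable(h1, h2, f)
```

For a K x K kernel the direct form costs on the order of K^2 multiplications per output point, while the separable form costs on the order of 2K.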

It is clear that the continuous-space model of (D.57) is inappropriate
for digital storage or processing of images, and one usually obtains a
discrete model by sampling the left-hand side of (D.57) and by approximating
the right-hand side using some type of quadrature formula (see [D-4,23,81]
for discussions of the errors involved in this approximation). One then
ends up with a model of the form

    g(i,j) = \sum_{k,\ell} h(i,j,k,\ell) f(k,\ell)              (D.61)

where the g(i,j) form the 2-D image array, the f(k,\ell) form the object array,
and the h(i,j,k,\ell) form the discrete point-spread function. Note that the
simplifications (D.58)-(D.60) can also be imposed in the discrete domain.
For example, shift invariance yields the 2-D convolution

    g(i,j) = \sum_{k,\ell} h(i-k, j-\ell) f(k,\ell)             (D.62)

Most digital image processing schemes involve the analysis of equations (D.61)
or (D.62),^14 and we will spend most of our time with them. As all images of
interest are of finite extent, we assume that the range of i, j, k, and \ell in
(D.61) and (D.62) is 1,...,N.^15

^14 We refer the reader to [D-81], in which a mixed continuous-discrete digital
scheme is discussed. The image g is sampled, but the continuous form of the
right-hand side of (D.57) is left intact. Spline approximations are used to
estimate the image between samples.

^15 There is no loss of generality in assuming a square picture, as we can always
pad a rectangular image array with zeroes in order to make it square.

In addition to the image formation process, one must take into account
the process of image recording and storing. As discussed in [D-4,81,82], two
well-developed and related image models for photographic images are the
intensity and density images, which are related in an essentially logarithmic
manner. Let g_i(x,y) be the intensity of light reflected from a photographic
film on which is stored the image represented by the intensity function g(x,y).
Then (see [D-4]) the intensity image model is

    g_i(x,y) = N(x,y) [g(x,y)]^{\gamma}                  (D.63)

where \gamma is known for the given type of film (it essentially controls contrast),
and N(x,y) is film grain noise due to random fluctuations of silver density on
the film. On the other hand, the density image model is essentially the
logarithm of (D.63)

    g_d(x,y) = \gamma \log[g(x,y)] + n_d(x,y)            (D.64)

As described in [D-4,81], the complexities of these models have been avoided
in most cases. Equation (D.63) has been replaced by an additive model

    g_i(x,y) = g(x,y) + n_i(x,y)                         (D.65)

while the low contrast assumption [D-4,81] has been used to justify replacing
(D.64) with

    g_d(x,y) = \gamma g(x,y) + n_d(x,y)                  (D.66)

It is not our purpose here to justify these models and assumptions, and we
refer the reader to the references for more details of the modelling of
imaging systems.

Given the above discussion, we now have the following mathematical model:
a discretized object f(i,j) and "noise-free" image g(i,j), where i,j = 1,...,N,
and f and g are related by (D.61) or (D.62); an observed image

    q(i,j) = g(i,j) + v(i,j)                             (D.67)

where v is an additive noise process.^16 We now turn our attention to the
analysis of this model. We will return to consider the nonlinear models
(D.63), (D.64) somewhat later.

At various points in this development, it will be more convenient to view
f, g, q, and v as vectors by performing a scan (lexicographic) ordering. For
example

    f = [ f(1,1), f(1,2), ..., f(1,N), f(2,1), ..., f(2,N), ..., f(N,N) ]'
      = [ f_1', f_2', ..., f_N' ]'                       (D.68)

where f_i = (f(i,1), ..., f(i,N))'. In this case the relevant equation is

    q = Hf + v                                           (D.69)

where H is an N^2 x N^2 matrix formed from the PSF. Examination of (D.61) and
(D.68) yields the following form for H



    H = \begin{bmatrix}
          H_{11} & H_{12} & \cdots & H_{1N} \\
          H_{21} & H_{22} & \cdots & H_{2N} \\
          \vdots & \vdots &        & \vdots \\
          H_{N1} & H_{N2} & \cdots & H_{NN}
        \end{bmatrix}                                    (D.70)

where H_{ij} is N x N and its (m,n) element is h(i,m,j,n). If the imaging system
is shift-invariant, i.e. if (D.62) holds, it is readily seen that H is
block Toeplitz, i.e.

    H_{ij} = H_{i-j}                                     (D.71)^17

and, in fact, each of the blocks is itself a Toeplitz matrix. This fact
will be extremely important when we discuss the computational aspects of
certain processing algorithms. Note also that if H is separable, then

    H = A_1 \otimes A_2                                  (D.72)

where \otimes denotes the tensor or Kronecker product, and A_1 and A_2 are N x N
matrices given by

    A_i = \begin{bmatrix}
            h_i(1,1) & h_i(1,2) & \cdots & h_i(1,N) \\
            h_i(2,1) & h_i(2,2) & \cdots & h_i(2,N) \\
            \vdots   & \vdots   &        & \vdots   \\
            h_i(N,1) & h_i(N,2) & \cdots & h_i(N,N)
          \end{bmatrix},   i = 1,2                       (D.73)

where

    h(i,j,m,n) = h_1(i,m) h_2(j,n)                       (D.74)

Note that horizontal stationarity implies that A_1 is Toeplitz, while vertical
stationarity implies that A_2 is Toeplitz.

^16 This noise may include more than film grain noise. Specifically, the effects
of light from sources other than the object can be included in v.

^17 Note that all that is needed for (D.71) is "horizontal stationarity", i.e.
h(i,m,j,n) = h(i-j,m,n). Vertical stationarity in turn implies that each
block is Toeplitz.
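The Kronecker structure in (D.72) can be verified directly for a small case. The sketch below is our illustration, with randomly generated stand-ins for A_1 and A_2 and a row-by-row scan ordering assumed for the vectors; it checks that multiplying the scan-ordered object by A_1 \otimes A_2 gives the same result as applying A_1 along one index and A_2 along the other.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
A1 = rng.standard_normal((N, N))   # stand-in for [h_1(i,k)]
A2 = rng.standard_normal((N, N))   # stand-in for [h_2(j,l)]
F = rng.standard_normal((N, N))    # object array f(k,l)

# Array form: g(i,j) = sum_{k,l} h_1(i,k) h_2(j,l) f(k,l)
G = A1 @ F @ A2.T

# Vector form: q = H f with H = A1 (Kronecker) A2 and scan (row-major) ordering
H = np.kron(A1, A2)                # N^2 x N^2
g_vec = H @ F.reshape(-1)

# Block (i,j) of H is the N x N matrix A1[i,j] * A2
block_12 = H[1 * N:2 * N, 2 * N:3 * N]
```

The array form needs two N x N matrix products, whereas forming H explicitly requires storing N^4 entries, which is the practical reason separability is attractive.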

It is evident from the preceding development that probabilistic and
statistical methods must play some role in image processing. In this context,
f, q, v, and perhaps h are random fields. Such a random field s(i,j) is
characterized by some type of statistical description -- the joint density of
the values of the field at different points or perhaps a statistical model
such as a 2-D ARMA model. We will consider some of these more complex
descriptions at a later point, but for now all we will use is the mean and
covariance

    \bar{s}(i,j) = E[s(i,j)]                                                  (D.75)

    r(i,j,m,n) = E\{ [s(i,j) - \bar{s}(i,j)] [s(m,n) - \bar{s}(m,n)] \}        (D.76)

The field will be called (wide-sense) stationary^18 if

    r(i,j,m,n) = r(i-m, j-n)                                                  (D.78)

Note that if s and \bar{s} are ordered lexicographically, then

    E[ (s - \bar{s}) (s - \bar{s})' ] = R                                     (D.79)

where R is the N^2 x N^2 matrix obtained from r in the same manner that
H in (D.70) is obtained from the PSF h. We also observe that R is block
Toeplitz if s is stationary in the horizontal direction, and each block is
itself Toeplitz if we have vertical (and hence full) stationarity. In
addition, if the covariance is separable

    r(i,j,m,n) = r_1(i,m) r_2(j,n)                                            (D.80)

we can obtain a representation for R much as the one for H in (D.72). Note
that in some sense (D.80) says that correlations in the data have horizontal
and vertical as "preferred directions". While this may be reasonable in some
cases (perhaps for cases in which one variable is space and the other is time)
and may be acceptable in others (because it leads to mathematical tractability
and good results), in many cases the assumption of (D.80) may be totally
inappropriate. We will comment on this further later in this section.
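As a small numerical check of the structure just described, the sketch below builds R from a stationary covariance and tests it for block Toeplitz form with Toeplitz blocks. The particular covariance (exponential decay in Euclidean distance, which is not separable), the helper names, and the row-by-row scan ordering are our assumptions for illustration only.

```python
import numpy as np

N = 4

def r(di, dj, rho=0.9):
    """Hypothetical stationary covariance r(i-m, j-n); isotropic, not separable."""
    return rho ** np.hypot(di, dj)

# Scan ordering: pixel (i,j) maps to vector index i*N + j (0-based indices).
R = np.empty((N * N, N * N))
for i in range(N):
    for j in range(N):
        for m in range(N):
            for n in range(N):
                R[i * N + j, m * N + n] = r(i - m, j - n)

def block(a, b):
    """The N x N block of R in block-row a and block-column b."""
    return R[a * N:(a + 1) * N, b * N:(b + 1) * N]

def is_toeplitz(M):
    """A matrix is Toeplitz when it is constant along each diagonal."""
    return np.allclose(M[1:, 1:], M[:-1, :-1])

block_toeplitz = all(np.allclose(block(a + 1, b + 1), block(a, b))
                     for a in range(N - 1) for b in range(N - 1))
toeplitz_blocks = all(is_toeplitz(block(a, b))
                      for a in range(N) for b in range(N))
```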


^18 This is not quite standard, since one usually also requires \bar{s}(i,j) = constant.
Clearly any process which is stationary in our sense can be transformed into
one in this stronger sense by subtracting out the mean.

One important problem in image processing is the efficient representation
of images for storage or transmission [D-4,31,37,76,241]. For such applications,
one wishes to represent the image with as few pieces of information as possible
but with a reasonable level of accuracy. Intuitively, one then wants the
redundancy in the pieces of information kept to a minimum. Suppose we are
given an image s^19 with covariance R. The off-diagonal elements of R tell us
how much correlation there is among the various pixels ("picture elements",
i.e., components of s), and this correlation can be interpreted as a measure
of the redundancy in the picture. One method for obtaining a less redundant
representation is to transform s

    \sigma = Ts                                          (D.81)

(where T^{-1} = T') so that the covariance of \sigma

    \Sigma = TRT'                                        (D.82)

is diagonal -- i.e. T is the matrix of eigenvectors of R and the components
of \sigma are uncorrelated. This transformation is called the Karhunen-Loeve
transform, and its use in efficient coding can be seen as follows (see, for
example, [D-4]). Let us order the eigenvalues of R in order of decreasing
magnitude. Then we store or transmit only those components of \sigma corresponding
to the M<N largest eigenvalues. We are guaranteed to have retained those
"coordinates" of the image that contain the most information, and we can obtain
an approximate image by inverting the transform:

    \hat{s} = T' \hat{\sigma}                            (D.83)

where \hat{\sigma} is formed by setting to zero those components of \sigma that
were discarded. We can in fact decide how many terms to keep on the basis of
the size of the reconstruction error

    e = s - \hat{s}                                      (D.84)

^19 Either an N x N array or an N^2 vector. We shall use these two forms
interchangeably and without comment unless there is a chance of confusion.

As discussed in [D-4,37], the Karhunen-Loeve transform leads to a very
efficient coding scheme. However, in general, this transform involves
exorbitant amounts of computation: we must find the eigenvectors and
eigenvalues of R (usually just once off-line for a class of images with the same
covariance), and then we must perform the transform coding (D.81) or decoding
(D.83). This can involve a great deal of on-line computation (see [D-4,37]
for estimates), since there is no "fast" method for performing this transform,
in general. There are, however, several special cases in which this transform
can be calculated efficiently. One of these [D-241] involves the use of a
more detailed model of the image as a random field, and we will defer discussion
of it until we begin our treatment of more detailed models for fields and
images. Another case, motivated by similar analysis performed by Hunt [D-4,46]
and Andrews and Hunt [D-81], is quite instructive, and, as we will use this
idea on several occasions, we will develop it here in detail.

Suppose that s is stationary. Then R is a block Toeplitz matrix with
Toeplitz blocks. Following [D-81], suppose further that a particular pixel
is correlated with a number of surrounding pixels, but is uncorrelated with
ones some distance d away (Andrews and Hunt cite d=20-30 pixels as a typical
number). Then the block Toeplitz covariance matrix takes the form

    R = [ R_{i-m} ],  i,m = 1,...,N,   with R_k = 0 for |k| > d               (D.85)

    R_k = [ r(k, j-n) ],  j,n = 1,...,N,   with r(k, j-n) = 0 for |j-n| > d   (D.86)

We now modify R and the R_i to make R block circulant and R_i circulant. A
block circulant matrix is block Toeplitz with each row a cyclic shift to the
right of the preceding one, where the last block on the right of one row
becomes the first block on the left in the next row. Examining (D.85), (D.86),
we see that this merely means replacing some of the zeroes with nonzero entries.

The reasons for doing this and its interpretation can be found in the
following observations:

1. Let R_c denote the circulant approximation to R, and let T_c
   be the matrix of eigenvectors of R_c. Then the product
   T_c s can be computed efficiently using the fast Fourier
   transform. This is shown in Appendix 2 and is the reason
   for using this approximation.

2. For N large compared to d, ||R - R_c|| is small, where
   ||.|| is any matrix norm. In addition, this error
   can be made arbitrarily small by choosing N large
   enough (see [D-81]).

3. Let us see what the circulant approximation means.
   For R_c to be block circulant, we must have that

       r(i,j,m,n) = r[(i-m) mod N, j, n]                 (D.87)

   Intuitively, instead of thinking of the image as a
   flat array, think of it as a cylinder, so that
   horizontal distance matters only modulo N. Furthermore,
   if we also have that each block is itself circulant,
   we should think of the image as the surface of a torus^20
   (connect the two ends of the cylinder). See Figure
   D.9 for an illustration of this.
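The first observation rests on the fact that a circulant matrix is diagonalized by the DFT. A minimal 1-D sketch of ours follows (the 2-D block circulant case uses the 2-D DFT in the same way); the random first column c and the test vector x are illustrative assumptions.

```python
import numpy as np

N = 8
rng = np.random.default_rng(2)
c = rng.standard_normal(N)          # first column of a circulant matrix
x = rng.standard_normal(N)

# Circulant matrix: entry (i,j) depends only on (i - j) mod N
C = np.array([[c[(i - j) % N] for j in range(N)] for i in range(N)])

# Matrix-vector product two ways: directly, and as a cyclic convolution via the FFT
y_direct = C @ x
y_fft = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

# The DFT matrix diagonalizes C; the eigenvalues are the DFT of the first column
F = np.fft.fft(np.eye(N))
Lam = F @ C @ np.linalg.inv(F)
```

The direct product costs on the order of N^2 operations, while the FFT route costs on the order of N log N.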

As discussed in [D-37] the Karhunen-Loeve expansion can also be performed
quickly if the covariance is separable. In this case, we perform the
expansion separately in the horizontal and vertical directions -- essentially
1-D transforms on data records of length N. Hence in the stationary or
separable cases, there appear to be relatively efficient methods to perform
the transform. However, motivated by the complexity of the general
Karhunen-Loeve expansion, researchers have applied other, more efficient transform
techniques such as the FFT and the Hadamard transform to the problem of image
compression and coding (see [D-4,37,81] for a discussion of several of these).
Many of these work nearly as well as Karhunen-Loeve [D-4]. This is not
surprising given the preceding discussion concerning circulant approximations.
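The transform coding steps (D.81)-(D.84) can be sketched for a small 1-D record as follows. This is our illustration only: the covariance (a first-order exponential form), the record length, and the helper names are assumptions, and the data are taken to be zero mean.

```python
import numpy as np

n = 16
idx = np.arange(n)
R = 0.95 ** np.abs(idx[:, None] - idx[None, :])   # hypothetical covariance

eigvals, eigvecs = np.linalg.eigh(R)      # ascending eigenvalues, orthonormal eigenvectors
order = np.argsort(eigvals)[::-1]         # reorder by decreasing eigenvalue
lam = eigvals[order]
T = eigvecs[:, order].T                   # rows of T are eigenvectors, so T^{-1} = T'

def kl_approx(s, M):
    """Keep the M leading Karhunen-Loeve coefficients of s and invert the transform."""
    sigma = T @ s                                         # (D.81)
    sigma_hat = np.where(np.arange(n) < M, sigma, 0.0)    # discard the rest
    return T.T @ sigma_hat                                # (D.83)

def expected_sq_error(M):
    """E[e'e] for zero-mean s with covariance R when M coefficients are kept."""
    D = np.diag((np.arange(n) >= M).astype(float))   # selects discarded coefficients
    E = T.T @ D @ T                                  # e = s - s_hat = E s
    return np.trace(E @ R @ E.T)

rng = np.random.default_rng(3)
s = rng.standard_normal(n)
s_full = kl_approx(s, n)      # keeping every coefficient reproduces s
```

The expected squared reconstruction error equals the sum of the discarded eigenvalues, which is why ordering by decreasing eigenvalue is the natural choice.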

As discussed in Section B, one of the most widely used coding or
compression schemes for 1-D time series, such as speech, is linear prediction,
in which we design a one-step predictor or inverse whitening filter (depending
upon your point of view) for the time series. This method has several
appealing features in 1-D: it is efficient (if one uses the Levinson algorithm),
it leads to recursive coding and decoding algorithms, and it yields excellent
performance. In 2-D the situation is not nearly as clear. What direction
do we predict in and what old data do we use to do the prediction? Genin
and Kamp [D-144,145] have shown that 2-D least-squares inverse filters need
not be stable; can this problem be overcome? Are there efficient 2-D
algorithms along the lines of Levinson's method? We will address some of these
questions later as we develop more detailed stochastic models. Let us point
out here, however, that for a particular ordering of the points in a 2-D
array, Habibi [D-76] and Habibi and Robinson [D-37] have obtained encouraging
results using a predictive encoder. In comparison with transform methods,
they found the predictive coding scheme to be superior as far as system
complexity, time delay due to the coding operation, and coding performance at
high bit rates, but the transform methods were more robust to errors in the
knowledge of the image covariance and required lower bit rates. In addition,
Habibi and Robinson [D-37] suggest a hybrid scheme in which we transform
the data horizontally line by line and then perform 1-D linear prediction on
each column. They report that the performance of this system is excellent.
These promising results and the questions mentioned earlier concerning the
direction of prediction are sufficient to warrant further investigation of
such methods.

^20 Seviora [D-114] and Seviora and Sablatash [D-115] dealt with a general
framework that included transforms on cylindrical and toroidal spaces for the
purpose of digital signal processing.

Figure D.9: Illustrating the Circulant Approximation

We now turn our attention to the problem of restoring blurred and
noise-corrupted images. Initially we will concentrate on the linear model (D.61),
(D.62), (D.67) or, equivalently, (D.69). For details concerning these methods
we refer the reader to the references and in particular to the survey papers
[D-4,19,38] and the text [D-81].

One of the first methods proposed for image restoration is aimed solely
at the removal of the effects of blur and essentially ignores the presence of
additive noise. This is the inverse filter

    \hat{f} = H^{-1} q                                   (D.88)

In the space-invariant case, (D.62), we can take transforms

    \hat{F}(z_1,z_2) = \frac{Q(z_1,z_2)}{H(z_1,z_2)}     (D.89)

In addition, in this case H is block Toeplitz with Toeplitz blocks, and
hence we can make the circulant approximation (assuming that the extent of
the PSF is much smaller than the size of the picture -- see [D-81]) and
hence can take the DFT of (D.62), yielding

    \hat{F}(m,n) = \frac{Q(m,n)}{H(m,n)}                 (D.90)

where, for example,

    H(m,n) = \sum_{k,\ell=0}^{N-1} h(k,\ell) W_N^{km+\ell n}       (D.91)

Note that as an alternative to making the circulant approximation, we can
use the 2-D version of a standard 1-D idea: we embed the 2-D acyclic
convolution (D.62) in a larger 2-D cyclic convolution by padding each row
and column with a sufficient number of zeroes. Equivalently, we intersperse
zeroes in the appropriate places in the lexicographically ordered vectors
q, f, etc., and in the block matrix H [D-46]. The resulting matrix \hat{H} is
block circulant with circulant blocks (see Appendix 2 for the correspondence
between circulant matrices and cyclic convolution). Thus, we can directly
apply (D.90) with no approximation to this padded image.
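The zero-padding idea can be illustrated with a short sketch of ours; the array sizes and helper names are assumptions. It shows that the 2-D DFT of suitably padded arrays reproduces the acyclic convolution exactly, whereas the unpadded cyclic convolution suffers from wraparound.

```python
import numpy as np

def conv2d_direct(h, f):
    """Acyclic 2-D convolution g(i,j) = sum_{k,l} h(i-k, j-l) f(k,l)."""
    g = np.zeros((h.shape[0] + f.shape[0] - 1, h.shape[1] + f.shape[1] - 1))
    for k in range(f.shape[0]):
        for l in range(f.shape[1]):
            g[k:k + h.shape[0], l:l + h.shape[1]] += f[k, l] * h
    return g

def conv2d_fft_padded(h, f):
    """Same result via a larger cyclic convolution; fft2 zero-pads to shape s."""
    rows = h.shape[0] + f.shape[0] - 1
    cols = h.shape[1] + f.shape[1] - 1
    Hp = np.fft.fft2(h, s=(rows, cols))
    Fp = np.fft.fft2(f, s=(rows, cols))
    return np.real(np.fft.ifft2(Hp * Fp))

def conv2d_cyclic(h, f):
    """Cyclic convolution on the original grid (no extra padding of f)."""
    return np.real(np.fft.ifft2(np.fft.fft2(h, s=f.shape) * np.fft.fft2(f)))

rng = np.random.default_rng(6)
h = rng.random((3, 3)) + 0.1    # hypothetical positive PSF samples
f = rng.random((8, 8)) + 0.1    # hypothetical positive object array

g_direct = conv2d_direct(h, f)
g_padded = conv2d_fft_padded(h, f)
g_cyclic = conv2d_cyclic(h, f)
```

With positive data the wraparound terms add to the corner of the cyclic result, so g_cyclic[0, 0] exceeds g_direct[0, 0].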

Let us make several comments concerning the inverse filter. First of
all, the image formation process (D.61), (D.62) may not be invertible, and
thus, we cannot even perform the calculation indicated by (D.88). One might
consider using a pseudo-inverse, and we will discuss this in the context of
another restoration methodology. In addition, examining the transformed
versions (D.89), (D.90), we see the possibility of two further problems.
The frequency response H usually falls off at high frequencies. Thus,
assuming that high frequency noise is present, we may observe extreme noise
amplifications. In addition, the inverse filter transfer function blows up
at the zeroes of H, and this can cause severe difficulties. Looking at these
equations in the space domain, Sondhi [D-19], Hunt [D-4], and Andrews and
Hunt [D-81] argue that the difficulty arises due to the severe problems
encountered in attempting to invert integral equations such as (D.57). In
the discrete domain, this implies the ill-conditioning of the matrix H, and
thus, even if its inverse exists, the solution suggested by (D.88):

    \hat{f} = f + H^{-1} v                               (D.92)

may be dominated by the noise.

In order to overcome difficulties such as these, one must explicitly
take the presence of noise into account. This leads to the discrete Wiener
filter formulation [D-4,17,19,38,40,81]. Consider (D.69) with

    E(ff') = P,   E(vv') = R,   E(fv') = 0               (D.93)

and suppose we wish to choose our estimate \hat{f} as the minimum mean square
error (MMSE) estimate

    \min_{\hat{f}} E[ (f - \hat{f})' (f - \hat{f}) ]     (D.94)

If we limit ourselves to linear transformations on the data or if we assume
Gaussian statistics,^21 we obtain the optimal estimate

    \hat{f} = PH'(HPH' + R)^{-1} q                       (D.95)

Again let us note that in the space-invariant, zero-mean,^22 stationary case,
we can perform (D.95) in the frequency domain, obtaining an expression
analogous to (D.89). In addition, in this case all of the matrices are block
Toeplitz, and we can use the same block circulant approximation to obtain an
expression analogous to (D.90):

    \hat{F}(m,n) = \frac{H^*(m,n) Q(m,n)}{|H(m,n)|^2 + \Phi_v(m,n)/\Phi_f(m,n)}      (D.96)

where * denotes complex conjugate, \Phi_v is the 2-D DFT of the noise covariance,
and \Phi_f is the DFT of the image covariance.

^21 The Gaussian assumption can clearly only be made for convenience, since we
know a priori that all components of f must be >= 0. We note that although
this eliminates the Gaussian assumption in theory, in practice one often makes
it anyway, since it leads to tractable problem formulations and acceptable
system performance (see, for example, [B-104], where the same type of
positivity assumption was encountered).

^22 The zero mean assumption is included to guarantee the block-Toeplitz
structure of P and R. If we have nonzero means for f and v, we can subtract
out their effects from (D.69) and proceed with the analysis. In this case,
the estimate produced by (D.96) is the estimate of the deviation of f from
its a priori mean.
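The contrast between the inverse filter (D.90) and the Wiener filter (D.96) can be illustrated with a small simulation. The sketch below is ours and makes several simplifying assumptions that should be kept in mind: the blur is a cyclic Gaussian PSF, the object is a synthetic square, and the ratio \Phi_v/\Phi_f is replaced by a single constant nsr rather than the true frequency-dependent spectra.

```python
import numpy as np

def gaussian_psf(N, width):
    """Cyclic Gaussian PSF on an N x N grid, normalized to unit sum (hypothetical blur)."""
    d = np.minimum(np.arange(N), N - np.arange(N))   # cyclic distance from index 0
    g = np.exp(-0.5 * (d / width) ** 2)
    h = np.outer(g, g)
    return h / h.sum()

def inverse_filter(q, h):
    """Frequency-domain inverse filter in the form of (D.90)."""
    return np.real(np.fft.ifft2(np.fft.fft2(q) / np.fft.fft2(h)))

def wiener_filter(q, h, nsr):
    """Form of (D.96) with Phi_v/Phi_f approximated by the constant nsr (assumption)."""
    H = np.fft.fft2(h)
    F_hat = np.conj(H) * np.fft.fft2(q) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(F_hat))

N = 32
rng = np.random.default_rng(4)
f = np.zeros((N, N))
f[10:22, 8:20] = 1.0                                  # synthetic object
h = gaussian_psf(N, 1.5)
g = np.real(np.fft.ifft2(np.fft.fft2(h) * np.fft.fft2(f)))   # cyclic blur
q = g + 0.01 * rng.standard_normal((N, N))            # observed image

mse_inverse = np.mean((inverse_filter(q, h) - f) ** 2)
mse_wiener = np.mean((wiener_filter(q, h, nsr=1e-3) - f) ** 2)
```

Because the Gaussian frequency response is extremely small at high frequencies, the inverse filter amplifies even this low noise level enormously, while the added term in the denominator keeps the Wiener-type estimate bounded.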


Note from (D.95), (D.96) that the problem observed with the inverse
filter has been removed -- i.e. the inverse in (D.95) and the denominator
in (D.96) won't blow up, since we have explicitly included the effects of noise.
The Wiener filter does, however, have some difficulties and limitations as an
image processing system. To a great extent this is due to the fact that the
MMSE criterion is not particularly well-suited to the way in which the human
visual system works (see Stockham's paper [D-82] for a discussion of the
visual system). In particular, the Wiener filter is overly concerned with
noise suppression. In addition, in order to make the filter computationally
feasible, one often assumes stationarity. This in turn leads to a filter that
is insensitive to abrupt changes -- i.e. it tends to smooth edges and reduce
contrast. On the other hand, in high contrast regions, the human visual system
will readily accept more noise in order to obtain greater resolution. Thus,
the Wiener filter sacrifices too much in resolution in favor of noise
suppression. We will return to this key image processing tradeoff later in
this section.

Another difficulty with the Wiener filter is the amount of a priori
information that is required. For the inverse filter all we need is the
PSF,^23 while for the Wiener filter we need the PSF and the second order
statistics of the original image and the noise. This is a great deal of
information to assume to be known, and a serious question here concerns the
robustness of the Wiener filter to errors in this a priori knowledge.

^23 As discussed in [D-81], the PSF is usually assumed to be known, and for certain
types of blur, this is a reasonable assumption. However, in many cases, either
the entire PSF or several of its parameters are not known a priori and must be
estimated. We will discuss this shortly.

Several schemes have been proposed that are aimed at trading off
between the potentially high-resolution, poor noise performance of the
inverse filter and the lower-resolution, good noise performance of the
Wiener filter. One of these is the constrained least squares filter,
suggested by Sondhi [D-19] and developed and discussed by Hunt [D-46] and
Andrews and Hunt [D-81]. In this formulation, we wish to choose f to
minimize

    J(f) = f'C'Cf                                        (D.97)

subject to the constraint

    (Hf - q)'(Hf - q) = e                                (D.98)

The solution is

    \hat{f} = (H'H + \gamma C'C)^{-1} H'q                (D.99)

where \gamma is a Lagrange multiplier found by iteration in order to satisfy
(D.98). Again one can obtain transform versions of (D.99) in the
shift-invariant case.
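The iteration for the multiplier can be sketched in a small 1-D example. This is our illustration, not the procedure of [D-46] or [D-81]: we take C = I, a hypothetical tridiagonal blur matrix H, and a simple bisection on the logarithm of \gamma, relying on the fact that for C = I the residual in (D.98) increases monotonically with \gamma.

```python
import numpy as np

def cls_restore(H, q, C, gamma):
    """Constrained least squares estimate in the form of (D.99)."""
    return np.linalg.solve(H.T @ H + gamma * C.T @ C, H.T @ q)

def residual(H, q, f_hat):
    """Left-hand side of the constraint (D.98)."""
    r = H @ f_hat - q
    return float(r @ r)

def find_gamma(H, q, C, e, lo=1e-12, hi=1e6, iters=200):
    """Bisection on log(gamma); assumes the target e is bracketed by [lo, hi]."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if residual(H, q, cls_restore(H, q, C, mid)) < e:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)

N = 16
# Hypothetical blur: tridiagonal Toeplitz matrix with kernel (0.25, 0.5, 0.25)
H = 0.5 * np.eye(N) + 0.25 * np.eye(N, k=1) + 0.25 * np.eye(N, k=-1)
rng = np.random.default_rng(5)
f = np.zeros(N)
f[5:11] = 1.0
noise_std = 0.05
q = H @ f + noise_std * rng.standard_normal(N)

e = N * noise_std ** 2            # target residual: expected noise energy (a common choice)
C = np.eye(N)
gamma = find_gamma(H, q, C, e)
f_hat = cls_restore(H, q, C, gamma)
achieved = residual(H, q, f_hat)
```

Replacing C by a finite difference matrix changes only the matrix passed to cls_restore; the bracketing assumption in find_gamma would then need to be rechecked.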

Several comments are in order concerning this approach, which has been
shown in several experiments to perform at a level superior to that of the
Wiener and inverse filters [D-4,81]. Note first of all from (D.99) that we
have eliminated the need for covariance information for f and v. In addition,
by adjusting the size of e in (D.98) (or equivalently of \gamma in (D.99)), we
can effectively control the amount of noise suppression. Also, we have
some freedom in the choice of C, and several possibilities and their
interpretations are discussed in [D-81]. For example, choosing C=I essentially
leads to a "pseudo-inverse" filter -- i.e. this filter resembles the inverse
filter but avoids the ill-conditioning by adding \gamma I to H'H before inverting.
In addition, one can choose C as a "finite difference matrix," which leads
to our minimizing some measure of the rate of fluctuation in the estimated
image. One can also choose C in order to match the characteristics of the
human visual system [D-4], and the choice

    C'C = P^{-1} R                                       (D.100)


leads to a "parametric Wiener filter," closely resembling (D.95) in structure.

Another approach, proposed by Stockham, et al. [D-4,81,E-4], leads to
a filter that is the geometric mean of the inverse and Wiener filters (hence
it directly trades off between the properties of these systems):

    \hat{F}(m,n) = \left[ \frac{1}{|H(m,n)|^2 + \Phi_v(m,n)/\Phi_f(m,n)} \right]^{1/2} Q(m,n)      (D.101)

This filter, obtained by designing a system so that the output power spectral
density equals that of the original image, has worked extremely well in
several experiments [D-4,81,E-4]. We note that (D.101) is not precisely
correct, as it does not include the phase effect of the restoring filter. Since
phase is extremely important in image processing and viewing, one must take
it into account. This has been done for several specific types of PSFs,
and we refer the reader to [E-4] and the references therein. In addition, in
examining (D.101) it appears that we again require a great deal of a priori
information; however, this particular filter is particularly well-suited to
the use of on-line estimates of quantities such as the PSF. We will discuss
this in more detail later in this section, and we refer the reader to [E-4]
for details.
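The "geometric mean" description of (D.101) can be checked at the level of filter magnitudes. In the sketch below, which is ours, H is an arbitrary set of nonzero complex frequency response samples and nsr stands for the ratio \Phi_v/\Phi_f; both are illustrative assumptions, and, as noted above, phase is ignored.

```python
import numpy as np

def inverse_gain(H):
    """Inverse filter transfer function, as in (D.90)."""
    return 1.0 / H

def wiener_gain(H, nsr):
    """Wiener filter transfer function, as in (D.96), with nsr = Phi_v/Phi_f."""
    return np.conj(H) / (np.abs(H) ** 2 + nsr)

def geometric_mean_gain(H, nsr):
    """Magnitude-only filter in the form of (D.101)."""
    return np.sqrt(1.0 / (np.abs(H) ** 2 + nsr))

rng = np.random.default_rng(7)
H = rng.standard_normal(64) + 1j * rng.standard_normal(64)   # hypothetical response samples
nsr = 0.1                                                    # hypothetical noise-to-signal ratio

gm = geometric_mean_gain(H, nsr)
product = np.abs(inverse_gain(H)) * np.abs(wiener_gain(H, nsr))
```

Since the Wiener magnitude never exceeds the inverse magnitude, the geometric mean always lies between the two, which is the sense in which this filter trades off their properties.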

In addition to these techniques, a number of other approaches along
these lines have been developed, and we refer the reader to the references
for details. At this point we want to make several observations concerning
these processing systems. Note first of all that they are nonrecursive and
in principle require the block processing of the entire image or substantial
sections of the image [D-47]. Hence the computational burden of these schemes
can be quite high. In the shift-invariant, stationary case this problem can
be somewhat alleviated with the aid of FFT techniques, but the required amount
of calculation is still substantial. The situation is even more complicated
if the PSF is shift-varying. Examples of such imaging systems are given by
Sawchuk [D-65] and Robbins and Huang [D-20]. In his paper, Sawchuk suggests
breaking the PSF into shift-invariant pieces, followed by the use of some of
the techniques we have discussed. Sawchuk and Robbins and Huang also discuss
the possibility of inverting nonlinear distortions in the imaging system,
followed by the use of shift-invariant methods. Clearly the PSF must be of a
special form for this to be possible.

The use of the FFT or the inversion of nonlinear distortions notwithstanding,
it is clear that the processing methods described so far require
a great deal of on-line calculation. In 1-D, one finds that recursive
methods are often preferable to nonrecursive ones because of their
computational advantages. Although the situation is not as clear in 2-D (as we
saw in subsection D.1), it certainly seems worthwhile to investigate recursive
2-D image processing methods. As discussed in [D-81] the 1-D Kalman filter
offers great computational savings over nonrecursive methods, and an appealing
question is the extension of such filters to 2-D. Anyone familiar with 1-D
Kalman filtering theory realizes that the design of the filter relies heavily
on a dynamic (i.e., recursive) representation of the received signal. Hence,
to develop such techniques in 2-D, we need more complex models of images
than that provided by the mean and covariance. The need for the use of such
models is an obvious drawback to this approach, but the potential gains in
computational efficiency represent a distinct advantage. We now will describe
several of the approaches taken in the application of recursive estimation
techniques to 2-D processing. This research topic is still in its early
stages of development, and many open questions remain.

One approach to recursive processing of images involves the 1-D pro- 
cessing of the scan-ordered image (see Section D.l). This work has been 
developed by Nahi, Silverman, and their colleagues [D-8,18,21,24,58,174] . 
Suppose we have an image f(m,n) (assumed to be zero mean for convenience) 
with stationary covariance 

    r(k,\ell) = E[f(m,n) f(m+k,n+\ell)]        (D.102)

Suppose we observe 

q(m,n) = f(m,n) + v(m,n) (D.103) 

where the additive noise v is, for simplicity, assumed to be zero mean
and white, with

    E[v(m,n) v(k,\ell)] = R \delta_{mk} \delta_{n\ell}        (D.104)

We now take the scan ordering of the NxN grid on which q, f, and v are
defined. Let us use the same symbols to denote the resulting 1-D processes.
We then have

    q(k) = f(k) + v(k)        (D.105)

    E[f(k) f(\ell)] = S(k,\ell)        (D.106)

    E[v(k) v(\ell)] = R \delta_{k\ell}        (D.107)

where S(k,\ell) can be calculated from knowledge of r(m,n).

Note that the scanned image f(k) is not stationary, just as in Section
D.1 we found that scanned 2-D systems did not become time-invariant 1-D
systems. The problem is clearly due to the abrupt change that occurs when
the scanner reaches the end of one line and begins the next. For example,
it is clear that one will have

    S(i,i+1) = S(i+1,i+2) = r(0,1)

if and only if i, i+1, and i+2 come from the same line of the image. On
the other hand, it is clear that 2-D stationarity plus the periodicity
of the scanner should yield some structure for S, and, in fact, it is easily
seen that

    S(k,\ell) = S(k+N,\ell+N)    \forall k,\ell        (D.108)

A process with this property is called cyclostationary, and many of its
properties have been analyzed in detail [D-43,80,83].
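
The structure of the scanned covariance can be illustrated numerically. The
sketch below is a hypothetical illustration, not taken from the references:
it assumes a zero-based scan index k = m*N + n and a separable exponential
covariance with made-up coefficients rho1 and rho2, builds S, and checks both
the periodicity (D.108) and the end-of-line effect described above.

```python
import numpy as np

def scan_covariance(r, N):
    """S[k, l] = E[f(k) f(l)] for the line-by-line scan of an N x N image.

    r(dm, dn) is the stationary 2-D covariance; pixel (m, n) is mapped to the
    1-D index k = m * N + n (zero-based, an indexing assumption of this sketch).
    """
    idx = np.arange(N * N)
    m, n = idx // N, idx % N
    return r(m[None, :] - m[:, None], n[None, :] - n[:, None])

rho1, rho2 = 0.9, 0.8  # hypothetical correlation coefficients
r = lambda dm, dn: rho1 ** np.abs(dm) * rho2 ** np.abs(dn)

N = 6
S = scan_covariance(r, N)

# Cyclostationarity: shifting both arguments by one line length N changes nothing.
cyclo = np.allclose(S[:-N, :-N], S[N:, N:])

# Non-stationarity: S(k, k+1) depends on whether k and k+1 lie on the same line.
same_line = S[0, 1]          # pixels (0,0) and (0,1): equals r(0,1)
across_lines = S[N - 1, N]   # pixels (0,N-1) and (1,0): equals r(1,-(N-1))
```

Here same_line equals r(0,1), while across_lines is much smaller because the
two pixels, although adjacent in the scan, lie at opposite ends of successive
lines.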

Given the model (D.105)-(D.107), one wishes to use Kalman filtering
techniques in order to suppress the noise. In order to do this, we need a
state space model for f. That is, we have a stochastic realization problem:
find a finite-dimensional linear system driven by white noise that yields
an output with correlation function given by (D.106). Unfortunately, as
pointed out in [D-21], S(k,\ell) does not have the required separability that
is needed in order for such a realization to exist. Hence, some sort of
approximation is needed, and several have been developed. The simplest of
these involves finding a stationary approximation to (D.106), much as Manry
and Aggarwal found shift-invariant approximations to the shift-varying
scanned filters they studied in [D-56]. The basic idea here, due to Franks
[D-43], is to use the stationary covariance

    R(k) = \frac{1}{N} \sum_{m=1}^{N} S(m,m+k)        (D.109)

This is equivalent to randomizing the variable m over the scan of one line in
the computation of E[f(m) f(m+k)].
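
The averaging in (D.109) is simple to carry out. The following sketch is
again a hypothetical illustration (zero-based scan indexing and a made-up
separable exponential covariance are assumptions): it averages S(m,m+k) over
the starting positions m in one line.

```python
import numpy as np

def S(k, l, N, r):
    """Scanned covariance for zero-based scan indices k and l (line length N)."""
    return r(l // N - k // N, l % N - k % N)

def stationary_approx(lag, N, r):
    """R(lag): average of S(m, m + lag) over the N positions m in one line."""
    m = np.arange(N)
    return np.mean(S(m, m + lag, N, r))

rho1, rho2 = 0.9, 0.8  # hypothetical correlation coefficients
r = lambda dm, dn: rho1 ** np.abs(dm) * rho2 ** np.abs(dn)

N = 6
R0 = stationary_approx(0, N, r)   # variance
R1 = stationary_approx(1, N, r)   # mixes within-line and end-of-line pairs
RN = stationary_approx(N, N, r)   # one full line apart: r(1,0) for every m
```

For lag 1, five of the six pairs lie on the same line and one straddles the
end of a line, so R1 is a weighted mixture of r(0,1) and r(1,-(N-1)); this is
the smearing that the stationary approximation introduces.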

Having R(k), one can then use some realization procedure to find a
Markov model

    x(k+1) = A x(k) + \omega(k)        (D.110)

    f(k) = c'x(k)        (D.111)

    E[\omega(k) \omega'(\ell)] = Q \delta_{k\ell}        (D.112)

that realizes or approximates the given correlation function. We refer the
reader to [D-18] for a method used by Nahi and Assefi.

We can now obtain an image restoration scheme by direct application of
Kalman filtering to the model (D.105), (D.107), (D.110)-(D.112). Several
comments are in order. We first note that the filter has an artificial
causality -- only the points above and to the left on the same line affect
the estimate of a given pixel. This can be partially removed by smoothing
the data -- i.e., by estimating each f(k) based on all the data. With the
model we have developed, this can be done efficiently with two Kalman filters,
scanning in opposite directions and starting at opposite ends of the image.
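
As an illustration of the forward pass only, the following is a minimal
sketch of a Kalman filter for a model of the form (D.110)-(D.112) with the
scalar observation (D.105). It is not the Nahi-Assefi design: the first-order
model, the numerical values, and the names kalman_filter, Q, R, and P0 are
assumptions made for this example. A smoother would add a second filter of
the same form run over the reversed data.

```python
import numpy as np

def kalman_filter(q, A, c, Q, R, P0):
    """Filtered estimates of f(k) = c'x(k) from q(k) = f(k) + v(k)."""
    x = np.zeros(A.shape[0])
    P = P0.copy()
    f_hat = np.empty(len(q))
    for k, qk in enumerate(q):
        # Measurement update with the scalar observation q(k).
        s = c @ P @ c + R
        K = P @ c / s
        x = x + K * (qk - c @ x)
        P = P - np.outer(K, c @ P)
        f_hat[k] = c @ x
        # Time update (one-step prediction).
        x = A @ x
        P = A @ P @ A.T + Q
    return f_hat

# Hypothetical first-order Markov signal with unit variance, observed in white noise.
rng = np.random.default_rng(0)
a, R, T = 0.95, 1.0, 2000
f = np.empty(T)
f[0] = rng.standard_normal()
for k in range(1, T):
    f[k] = a * f[k - 1] + np.sqrt(1.0 - a ** 2) * rng.standard_normal()
q = f + np.sqrt(R) * rng.standard_normal(T)

f_hat = kalman_filter(q, np.array([[a]]), np.array([1.0]),
                      np.array([[1.0 - a ** 2]]), R, np.array([[1.0]]))

mse_raw = np.mean((q - f) ** 2)       # error of the unprocessed observation
mse_filt = np.mean((f_hat - f) ** 2)  # error after causal filtering
```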

The resulting estimate still has difficulties because of the randomizing
used to obtain (D.110)-(D.112). This causes problems much like those caused
by Manry and Aggarwal's shift-invariant approximation. In this case, one can
remove some of these difficulties by transposing the image and performing the
same type of processing again (2 more Kalman filters scanning in a direction
orthogonal to the other 2 filters). This appears to be reminiscent of Pistor's
four quadrant decomposition [D-42] -- we have NE, NW, SE, and SW Kalman
filters.

A number of other comments can be made concerning this approach to image
processing. First of all, like the Wiener filter, the Kalman filter is based
on a MMSE criterion, and hence we can expect it to sacrifice resolution for
noise suppression. In addition, this method relies heavily on a priori
knowledge of the image covariance, and the robustness of the approach in the
presence of modeling errors remains an open question. We have already
commented on the problems inherent in the stationary approximation of the
cyclostationary covariance of the scanned image. In [D-21] Nahi suggests that one
use a piecewise stationary approximation over various sections of each
scanned line. This leads to a time-varying, piecewise-constant state variable
description for the scanned process.

Several alternative methods exist for reducing the effect of the stationary
approximation. Nahi and Franco [D-58] suggest the simultaneous
scanning of a number of lines ("vector scanning"). One can then model
correlations both along the scan and along the components of the vector of the
scan. If one scans all lines simultaneously, we can take all of these
correlations into account. Note that in this case we have turned a 2-D, scalar
signal into a 1-D, multivariable signal, much as we discussed in the preceding
subsection. Of course, this leads to problems with the dimensionality of
the resulting processor. Thus, Nahi and Franco suggest a "section-scan"
scheme which is in fact far more efficient than the scalar system described
previously. This sectioning approach is much like that of Manry and Aggarwal,
in which a number of lines are processed together, and different sections are
processed independently. An interesting point here is that Manry and Aggarwal
discussed the use of overlapping sections to avoid problems at the edges. A
similar approach might work well in the framework developed by Nahi and Franco.
We note, however, that the vector modelling in [D-58] requires the separability
of the image covariance. In fact, Nahi and Franco [D-58] and Franks [D-43]
argue that a good model to be used is the exponential model

    r(m,n) = \rho_1^{|m|} \rho_2^{|n|}        (D.113)

The necessity for using separable covariances is clearly a limitation,
but it does allow one to obtain detailed results. In addition to the work
mentioned above, Powell and Silverman [D-8] used the separability assumption
on r(m,n) to develop exact dynamic models for each line of the scalar and
vector scan processes. These models involve time-delays in the output
equation (due to the nonseparability of S(k,\ell)), and the dimension of the models
increases in proportion to the width of the scan. This last fact is not
surprising, since we saw in subsection D.1 that the dimension of the global
state of a 2-D system grows in proportion to the extent of the plane on
which the system is defined.

The recursive methods discussed so far have assumed that there is no
blurring due to a nontrivial PSF. If there is such blurring, essentially we
must develop a 1-D dynamical model for the effect of the blur along the scan.
The simplest example of this, motion blur along the direction of the scan,
was considered by Aboutalib and Silverman [D-24]. In the absence of
noise, they design the line-by-line inverse system to remove the blur both in
the space-invariant and space-variant cases. The inverse they propose is a
recursive one, and hence can be implemented with relatively small computational
demands. If noise is present, one augments the scalar or vector scan dynamic
models of Nahi, Assefi, and Franco with the dynamic model of the blur, and
uses the Kalman filter line by line (or section by section) to remove the blur
and to suppress the noise. Again this system offers computational advantages
over nonrecursive schemes, but the inverse system may be very sensitive to
errors in the knowledge of the PSF. The robustness properties of the Kalman
filter in this case are not yet known.

All of the recursive scan techniques have basically been one-dimensional,
in that no 2-D model for the image (beyond the usual covariance description)
has been used. Recently, however, a number of researchers [D-6,22,34,35,71,96,
148,173,174,229,236] have considered 2-D recursive models for images. The
first work along this line was that of Habibi [D-22], who considered the
separable covariance function given in (D.113). Habibi noted that this covariance
could be obtained from a 2-D, recursive, auto-regressive shaping filter

    x(k+1,\ell+1) = \rho_2 x(k+1,\ell) + \rho_1 x(k,\ell+1) - \rho_1 \rho_2 x(k,\ell)
                    + \sqrt{(1-\rho_1^2)(1-\rho_2^2)} w(k,\ell)        (D.114)

where w(k,\ell) is a white, zero mean process with

    E[w(k,\ell) w(m,n)] = \delta_{km} \delta_{\ell n}        (D.115)
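
That a recursion of this type produces a separable exponential covariance can
be checked through its unit-sample response, since for a white, unit-variance
input the output covariance is the autocorrelation of that response. The
sketch below is a hypothetical illustration: the coefficient values, the grid
size, and in particular the assignment of the two coefficients to the two
neighbouring pixels are assumptions of the example.

```python
import numpy as np

def habibi_impulse_response(rho1, rho2, K):
    """Response of the 2-D autoregressive recursion to a unit sample (K x K grid)."""
    g = np.sqrt((1.0 - rho1 ** 2) * (1.0 - rho2 ** 2))
    x = np.zeros((K + 1, K + 1))  # row 0 and column 0 hold zero boundary values
    for k in range(K):
        for l in range(K):
            w = 1.0 if (k == 0 and l == 0) else 0.0
            x[k + 1, l + 1] = (rho2 * x[k + 1, l] + rho1 * x[k, l + 1]
                               - rho1 * rho2 * x[k, l] + g * w)
    return x[1:, 1:]

def autocorr(h, a, b):
    """Sum over the grid of h(i, j) * h(i + a, j + b) for lags a, b >= 0."""
    K = h.shape[0]
    return np.sum(h[:K - a, :K - b] * h[a:, b:])

rho1, rho2, K = 0.9, 0.8, 60   # hypothetical values; K large enough to neglect truncation
h = habibi_impulse_response(rho1, rho2, K)
g = np.sqrt((1.0 - rho1 ** 2) * (1.0 - rho2 ** 2))
```

In this sketch h(i,j) is proportional to rho1**i * rho2**j, so the shaping
filter is itself separable, and autocorr(h, a, b) is approximately
rho1**a * rho2**b up to truncation of the grid.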

Assuming measurements of the form

    y(k,\ell) = x(k,\ell) + v(k,\ell)        (D.116)

Habibi then developed an estimator to estimate x(k+1,\ell+1) based on
{y(m,n) | m \le k, n \le \ell} -- i.e., this estimator is a one-step NE
predictor. Habibi chose an estimator structure of the form

    \hat{x}(k+1,\ell+1) = \rho_2 \hat{x}(k+1,\ell) + \rho_1 \hat{x}(k,\ell+1) - \rho_1 \rho_2 \hat{x}(k,\ell)
                          + F(k,\ell) [y(k,\ell) - \hat{x}(k,\ell)]        (D.117)

and determined a value for the gain F(k,\ell). Unfortunately, this estimator
is suboptimal, as pointed out by Strintzis [D-165]. The problem is that in
1-D Kalman filtering, it is well-known that in order to obtain the optimal
estimate recursively, one must estimate the entire state of the process.
However, as discussed in the preceding section, the global state has dimension
proportional to the extent of the 2-D domain under consideration. Hence
x(k,\ell) is not the global state, and we cannot expect its estimate alone to
suffice for recursive optimal estimation. In fact, as Morf et al. [D-162,
163] point out, x(k,\ell) is not the complete local state, and this makes the
meaning of (D.117) even more questionable. Still, as Strintzis mentions, the
structure of this estimator is so simple and intuitively appealing that it would
be worthwhile to determine just how suboptimal it is.

The most complete study of optimal 2-D Kalman filtering has been
performed by Woods and Radewan [D-173,229,236]. We assume that we have a
one-sided causal dynamic model (see Figs. D.5, D.6 and Equation (D.12)) for the
random field

    x(m,n) = \sum_{k=1}^{M} \sum_{\ell=-M}^{+M} b(k,\ell) x(m-k,n-\ell)
             + \sum_{\ell=1}^{M} b(0,\ell) x(m,n-\ell) + w(m,n)        (D.118)

This model can be assumed to be given or can be obtained from the image power
spectral density by means of 2-D spectral factorization [D-119]. This latter
method in general leads to infinite order factors which must be truncated.
A third method for obtaining the model (D.118) is by direct parameter
estimation using a method such as 2-D linear prediction. We will comment on
methods such as these later in this section.

Woods and Radewan consider the observation equation

    q(m,n) = x(m,n) + v(m,n)        (D.119)

where v is zero mean and white with variance R. Suppose we want to estimate
x(m,n) given all values of q in the past, where past is defined relative to
the direction of recursion in (D.118) -- i.e., given

    {q(i,j) | i \le m-1, all j} \cup {q(m,j) | j \le n}.

Woods and Radewan point out that this
can be done optimally with an extremely high dimensional Kalman filter to
estimate the global state of the system, which in this case has dimension on
the order of MN (M=order of the filter, N=width of the image). In fact, a
valid global state is (see Figure D.10)

    s(m,n)' = [x(m,n), x(m,n-1), ..., x(1,m); x(N,n-1), ..., x(1,n-1);
               ...; x(N,n-M), ..., x(m-M,n-M)]        (D.120)

Figure D.10: Illustrating the Global State of Woods-Radewan.

By inspection of the state and the recursion (D.118), it is clear that we
can write a 1-D equation for the scan-ordered process. Note that this model
will be time-varying, since we must take into account the initiation of a new
line. We can also write a relation between q and s

    q(m,n) = Hs(m,n) + v(m,n)        (D.121)

where H merely picks off the first element of s. Given this development,
a rather enormous Kalman filter can be written down. In addition, one can
obtain a more efficient optimal estimator by processing one line of data at
a time (see [D-229]).
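
To get a feeling for the size of this filter, the following sketch enumerates
the pixels in a global state of the type described above and forms the row
vector H. It is a hypothetical illustration only: it assumes 1-based indices,
that the first index runs along a line and the second index numbers the
lines, and that m > M and n > M so that no end-of-image effects arise.

```python
def global_state_indices(m, n, M, N):
    """Pixels (i, j) in the global state when pixel (m, n) is being processed.

    Assumes 1-based indices, scanning along i within line j, m > M and n > M.
    """
    current = [(i, n) for i in range(m, 0, -1)]                    # x(m,n), ..., x(1,n)
    full = [(i, n - d) for d in range(1, M) for i in range(N, 0, -1)]
    oldest = [(i, n - M) for i in range(N, m - M - 1, -1)]         # x(N,n-M), ..., x(m-M,n-M)
    return current + full + oldest

M, N = 2, 10          # hypothetical model order and image width
m, n = 5, 4           # hypothetical current pixel
state = global_state_indices(m, n, M, N)

# H picks off the first element of the state, as in the observation equation.
H = [1.0] + [0.0] * (len(state) - 1)
```

Under these assumptions the state has M*N + M + 1 elements regardless of the
position m along the line, which is the "order of MN" growth noted above; the
covariance propagated by the optimal filter is then a matrix of that dimension.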

As developed in [D-173,229,236], this filter does not correct for image
blur. However, it does appear that one can modify the development so that it
can. Suppose our observation is

    t(m,n) = \sum_{j=-P}^{P} \sum_{k=-P}^{P} h(j,k) x(m-j,n-k) + \xi(m,n)        (D.122)

where \xi is additive white noise. Note that in terms of the ordering implied
by the recursion (D.118), t(m,n) involves values of x that occur in the future.
This can be corrected by a time delay of the observations

    q(m,n) = t(m-P,n-P)        (D.123)

In this case, assuming 2P<M, we may write a relation of the form of (D.121),
where in this case v is a shifted version of \xi and H is such that we obtain
the proper blurring. If 2P>M, we must increase the dimension of s -- keep
more data from the past -- in order to make sure that s contains all the
components of x that affect q(m,n). We illustrate these ideas in Figure D.11.
From the figure it is clear that H in (D.121) will not be constant, since we
must take end-of-line effects into account, when portions of the diagonally
shaded region in Figure D.11 lie outside the range of the image. This
clearly can be done, and, as before, we obtain a giant Kalman filter. Another
method for optimal Kalman filtering in the presence of blurring has been
developed by Hart et al. [D-71]. They also use a global state for the filter,
but they assume that the different pixels are all independent -- i.e., that
all of the b(i,j) are zero in (D.118).

Optimal line-by-line Kalman filtering for images has also been considered
by Attasi and his colleagues [D-6,34,35,96] using a stochastic version of the
model discussed in subsection D.1. Specifically, consider noisy observations
of an image f(i,j)

    q(i,j) = f(i,j) + v(i,j)        (D.124)

where the image is assumed to be generated by a separable vector analog of
the model used by Habibi [D-22]

    x(i,j) = F_1 x(i-1,j) + F_2 x(i,j-1) - F_1 F_2 x(i-1,j-1) + w(i-1,j-1)
    f(i,j) = Hx(i,j)        (D.125)

We wish to obtain the optimal estimate \hat{x}(m,n) of x(m,n) given q(i,j) for
i \le m and all j. The optimal estimate in this case consists essentially of two
1-D operations. Suppose we have \hat{x}(m-1,n) for all n. We first predict ahead
one line to obtain

    \bar{x}(m,n) = F_1 \hat{x}(m-1,n)    \forall n        (D.126)

Note that each of these estimates is calculated independently. We now
observe the new line of measurements q(m,n) for all n, and we create the error
process and the error measurement

    e(m,n) = x(m,n) - \bar{x}(m,n)        (D.127)

    y(m,n) = q(m,n) - H\bar{x}(m,n) = He(m,n) + v(m,n)        (D.128)

Thus we have a 1-D estimation problem -- estimate e(m,n) for all n, given
y(m,n) for all n. Attasi shows that one can obtain a finite dimensional
1-D realization for e(m,n) as a function of n. Hence, this estimation
problem reduces to the usual 1-D smoothing problem. The solution consists of
two 1-D Kalman filters starting at opposite ends of the line. The estimates
produced by these filters are then combined to produce \hat{e}(m,n) and

    \hat{x}(m,n) = \bar{x}(m,n) + \hat{e}(m,n)        (D.129)

For details, we refer the reader to the references. The "geometry" of the
estimator is illustrated in Figure D.12.

Let us make several comments concerning this estimator. First of all,
we see that the decoupled structure of the estimator yields a far more
efficient estimator than that of Woods and Radewan. This is apparently due
to the separability of the underlying model (D.125). Also, for this model
it is not clear if we can perform the same modifications in order to
incorporate blurring. This and the separability restriction are obvious drawbacks,
but the appealing structure of the filter is reason enough for further
investigation, especially given the compatibility of this algorithm with parallel
processing techniques. Furthermore, we note that the optimal smoother can
again be implemented with two filters of the type devised by Attasi -- one
sweeping the columns in order of increasing m, and the other in order of
decreasing m. Again, this is reminiscent of the decomposition of zero phase
filters into two half-plane filters [D-42,119].

Figure D.12: Illustrating the Structure of Attasi's Estimator.
(a) Predicting Ahead One Column. (b) Processing the New Column of Data with
Two Kalman Filters.

The method of proof used by Attasi involves the taking of z-transforms
along the n direction and the treatment of m as a time variable. Essentially
we are regarding the 2-D system as a high-dimensional (infinite if the domain
of n is unbounded) 1-D system, where we can use a spatial transform "along"
the 1-D state vector in order to simplify the calculations. The key step in
Attasi's development is a derivation of a set of Riccati equations, parametrized
by the transform variable z, for the power spectral density of e(m,n)
considered as a function of n. One can then factor these spectra to obtain
the 1-D realizations of the e's. As Attasi points out, the dimension of the
realization for e(m,n) is on the order of m times the dimension of x -- i.e.,
it grows linearly with m. One can avoid this difficulty by using reduced
order estimators. For example, we may choose to use the steady-state filter,
in which case we can obtain a finite dimensional system whose spectrum
approximates that of e(m,n).

Let us note that Attasi's work brings out several crucial issues.
Specifically we have seen the effective use of the equivalent representations
of signals as multivariable 1-D and scalar 2-D. We have also seen that
transforms along one of these variables can be useful in obtaining solutions.
Later in this section we will discuss the relation between 2-D processing
and distributed and decentralized control. The issues just mentioned will
be of great importance then as well.

As we have seen, optimal 2-D Kalman filtering algorithms require large
amounts of storage and computation. Thus, a number of researchers [D-34,148,
173,174,229,236] have developed suboptimal estimators that require less
computation. We will briefly describe several of these and refer the reader
to the references for more on this subject. Let us begin with the technique
of Woods and Radewan [D-173,229,236]. They consider two types of suboptimal
filters. The first involves breaking the picture up into strips of width
W<N. One then processes across and up these strips individually, much as with
the Manry-Aggarwal section-scan. This reduces the dimension of the global
state, as we replace N in (D.120) with W. Woods and Radewan also suggest
overlapping the strips in order to avoid the edge effects caused by incorrect
boundary conditions between strips.

The other suboptimal filter developed in [D-229] is the reduced update
Kalman filter. Examining the optimal filter of Woods and Radewan, we see that
the predict cycle is computationally straightforward -- one simply uses the
recursion (D.118) assuming no noise and using preceding estimates. The
measurement update part of the optimal filter, on the other hand, involves
updating the estimates of all of the components of the state. Assuming
N>>M, we expect that a given pixel is most correlated only with a small
percentage of the elements of the state vector. Therefore, it seems reasonable
only to update the estimates of those components of the state that are within
a certain distance of the point being processed. This should greatly simplify
the filter with minimal effect on performance. In other words, we are
essentially designing a constrained Kalman filter in which we constrain many of
the gain elements to be zero and essentially allow only "near neighbor updates."
We remark that a similar idea was proposed by Pratt [D-17] for the Wiener
filter and by Murphy and Silverman [D-174] in the Kalman filtering context.
In addition, it is interesting to note that similar ideas have been proposed
for large-scale systems in which measurements on a particular subsystem are
used to update only those subsystems that are "near" to it as determined by
some measure of dynamic interaction (see, for example, [D-201,205,208] for
related results for problems of freeway traffic control and estimation). We
will have more to say about this later.

Motivated by the simplicity of the filter proposed by Habibi [D-22] and
by the recursive local state-space model proposed by Roesser [D-110], Barry
et al. [D-148] have developed a class of constrained filters. Specifically,
they consider a noisy version of Roesser's model (D.36)

                                                        (D.130)


    y(i,j) = C_1 v(i,j) + C_2 h(i,j) + V(i,j)        (D.131)

where w and V are white noise processes. Their suboptimal estimator is then
taken to be the optimum estimator of the form

                                                        (D.132)


All of the recursive estimators that we have examined up to this point
have had two things in common -- they have involved discrete 2-D space and
have used recursive random field models. Recently Wong [D-172,187] reported
on some work on 2-D continuous-space estimation. This theory involves the
development of a stochastic calculus in 2-D, and this in turn has led to a
number of interesting theoretical results. We defer the discussion of this
topic until later in this subsection.

At this time it is worth mentioning that there has been work performed
on recursive processing of fields that come from nonrecursive models.
Specifically, Jain and Angel [D-32] have considered fields described by a
nearest neighbor, interpolative equation

    x(m,n) = \alpha_1 [x(m,n+1) + x(m,n-1)]
             + \alpha_2 [x(m+1,n) + x(m-1,n)] + w(m,n)        (D.133)

Fields of this type have been studied by several authors and were proposed
by Woods [D-9] as the prototype of discrete, 2-D Markov fields. We will
have more to say about the properties and other uses of these fields in a
short while. For now, we concentrate on the estimation problem when we
observe

    y(m,n) = x(m,n) + v(m,n)        (D.134)

Following [D-32], let us consider the vector scan process -- i.e., we
process an entire line of data at a time. Define the resulting 1-D vector
processes x_m, y_m, w_m, and v_m. For example

    x_m = [x(m,1), ..., x(m,N)]'        (D.135)

Then, one can write (D.133), (D.134) as

    x_{m+1} = Q x_m - x_{m-1} + w_m        (D.136)

    y_m = x_m + v_m        (D.137)

where Q is a symmetric, tridiagonal, Toeplitz matrix (D.138).

Examining the structure of (D.138), one is tempted to utilize the same
type of circulant approximation as that used by Andrews and Hunt [D-81] in
order to diagonalize the system efficiently with the aid of the FFT. However,
as Jain and Angel point out, the diagonalization of Q

    M'QM = diag(\lambda_1, ..., \lambda_N)        (D.139)

can be performed with the aid of the FFT without any approximation. Thus,
if we define the transformed quantities \tilde{x}_m, \tilde{y}_m, etc., where,
for example

    \tilde{x}_m = M'x_m        (D.140)

we obtain a set of N decoupled estimation problems, indexed by j (which
indexes the components of the transformed vectors):

    \tilde{x}_{m+1,j} = \lambda_j \tilde{x}_{m,j} - \tilde{x}_{m-1,j} + \tilde{w}_{m,j}        (D.141)

    \tilde{y}_{m,j} = \tilde{x}_{m,j} + \tilde{v}_{m,j}        (D.142)

Each of these problems can be solved using a second-order Kalman filter
(see [D-32] for an alternative method of derivation), and we obtain the
efficient implementation illustrated in Figure D.13. Again, if one wishes
to utilize all of the data to estimate each pixel, we can implement the
smoother by including a second bank of filters which sweeps the lines in the
opposite direction (m runs from N to 1). One can also implement a one step
smoother -- which estimates based on data through line m+1. This requires
only one bank of filters, as in Figure D.13. We refer the reader to
[D-32] for details.
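
The exact diagonalization can be illustrated numerically. It is a standard
fact that a symmetric, tridiagonal, Toeplitz matrix is diagonalized by the
orthogonal sine-transform matrix, which is what makes a fast (FFT-based)
implementation possible. The sketch below is a hypothetical illustration: the
entries a and b of Q are made-up values, and the transform is applied as a
dense matrix for clarity rather than through a fast algorithm.

```python
import numpy as np

N = 8
a, b = 4.0, -1.0  # hypothetical diagonal and off-diagonal entries of Q
Q = a * np.eye(N) + b * (np.eye(N, k=1) + np.eye(N, k=-1))

# Orthogonal sine-transform matrix and the corresponding eigenvalues of Q.
j = np.arange(1, N + 1)
M = np.sqrt(2.0 / (N + 1)) * np.sin(np.outer(j, j) * np.pi / (N + 1))
lam = a + 2.0 * b * np.cos(j * np.pi / (N + 1))

D = M.T @ Q @ M  # diagonal, with the entries of lam on the diagonal

# One step of the vector recursion, in the original and transformed variables.
rng = np.random.default_rng(1)
x_prev, x_cur, w = rng.standard_normal((3, N))
x_next = Q @ x_cur - x_prev + w
x_next_t = lam * (M.T @ x_cur) - M.T @ x_prev + M.T @ w
```

The last two lines show the decoupling: the transform of x_next agrees with
x_next_t, in which each component evolves through its own scalar second-order
recursion.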

The approach in [D-32] deserves some comment. Again as in Attasi's
work, we have seen that transforming variables in one dimension and processing
in the other (see footnote 24) can lead to extremely efficient processing
schemes. Just as with the block circulant approach of Andrews and Hunt, the
spatial stationarity of the 1-D equation (D.136) is such that the FFT can be
used to great advantage. This observation leads one to seek other formulations
that possess structure that can be exploited in this manner. Jain and Angel
mention several other random field models that lead to symmetric, tridiagonal,
Toeplitz evolution equations when scanned line by line, and in [D-30] Jain uses
similar analysis for the efficient recursive filtering of one of these models,
the so-called semicausal model:

    x(m,n) = \alpha_1 [x(m-1,n) + x(m+1,n)]
             - \rho \alpha_1 [x(m+1,n-1) + x(m-1,n-1)]
             + \rho x(m,n-1) + w(m,n)        (D.143)

The model was given this name since x(m,n) depends only on x(i,j) with j \le n
(note, however, that (D.143) is not recursive). We note that throughout
the development in [D-30,32] it is assumed that no blurring occurs. It is
not clear if the approach adopted in these references can be extended to
include the effect of a PSF, but the efficiency of the algorithms developed
by Jain and Angel indicates that it is certainly worth trying to find such
an extension. As we shall see, the use of structure in this manner can be
applied in a number of different settings.

24 Recall that the use of a transform in one direction followed by linear
prediction in the other was proposed as an image coding scheme by Habibi and
Robinson [D-37].

We have now surveyed a number of nonrecursive and recursive estimation
methods, and the techniques discussed to this point deserve some comment.
The recursive techniques come with many of the same criticisms that were made
concerning nonrecursive filters. They require detailed models of the image
statistics and image formation process, and they are essentially based on the
MMSE criterion. Hence, they in general will sacrifice resolution in favor
of noise suppression. In addition, these recursive techniques necessarily
affect the image because of the assumed model structure. The effect of this
in some cases (such as in the Kalman filter based on a stationary
approximation to the scanned image) may require additional processing (of the
transposed image, for example), while in other cases, such as the 2-D causal
models of Woods-Radewan and Attasi or the noncausal models of Jain and Angel,
the effects may not be so noticeable. We have seen that some of the recursive
techniques allow the inclusion of image blur, while in other cases the
extensions to include blur have yet to be developed. Also, we have seen that
in some cases optimal Kalman filtering is extremely complex, and suboptimal,
but intuitively appealing, recursive filter structures must be used. In other
cases -- specifically in the work of Attasi and of Jain and Angel -- we have
observed that the use of the structure of the assumed model can lead to
extremely efficient optimal estimation algorithms. In addition, although
work in this area has been limited in extent [D-24,174], the recursive
techniques are directly amenable to the analysis of space-varying and nonstationary
models. Thus, in spite of the many qualifications, we find enough positive
attributes to warrant continued study of recursive techniques for image
restoration.

Let us now comment and speculate on several aspects of image processing
that we have only mentioned in passing previously. First of all, we have
the problem of nonlinearities in image sensing. Consider first the
multiplicative noise model (D.63). As discussed in [D-82,E-2] and in Section E,
one can often filter signals corrupted by multiplicative noise by first
taking the logarithm, then filtering with a linear system, and then
exponentiating. This process, an example of homomorphic filtering, is described
in Section E. We note here that this technique has been applied with great
success [D-82,E-2], and in [D-82] it is argued that this type of processing
is extremely compatible with the response characteristics of the human visual
system.
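
The three steps of this procedure (logarithm, linear filter, exponential) are
easy to sketch. The example below is a hypothetical 1-D illustration, not the
processing used in [D-82,E-2]: the signal, the log-normal noise, and the
moving-average kernel are assumptions chosen only to show the mechanics.

```python
import numpy as np

def homomorphic_filter(q, kernel):
    """log -> linear filter -> exp, for a strictly positive 1-D signal q."""
    log_q = np.log(q)                                   # noise becomes additive
    smoothed = np.convolve(log_q, kernel, mode="same")  # any linear filter
    return np.exp(smoothed)                             # back to intensities

rng = np.random.default_rng(2)
f = 2.0 + np.sin(np.linspace(0.0, 2.0 * np.pi, 200))  # hypothetical positive signal
noise = np.exp(0.2 * rng.standard_normal(200))        # positive multiplicative noise
q = f * noise
kernel = np.ones(9) / 9.0                             # hypothetical moving average

f_hat = homomorphic_filter(q, kernel)

# Mean absolute log-error before and after filtering, away from the end points.
inner = slice(10, -10)
err_raw = np.mean(np.abs(np.log(q[inner] / f[inner])))
err_filt = np.mean(np.abs(np.log(f_hat[inner] / f[inner])))
```

Because the last step is an exponential, the estimate f_hat is positive by
construction, whatever linear filter is used in the middle step.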

Equation (D.64) illustrates another kind of measurement nonlinearity, 
in which the noise is additive but the signal is distorted in a nonlinear 
fashion. Hunt [D-4,81] has studied such image processing problems in the 
context of nonrecursive restoration techniques. Specifically, he has devised
an iterative scheme for computing the maximum a posteriori image estimate
given the observations. In the case of linear measurements, this reduces to
the Wiener filter. The analog of this technique for the recursive methods
is the extended Kalman filter (EKF), which essentially involves a continual
relinearization about the present best estimate. This method can readily
be derived for all of the recursive methods discussed. The interested reader
is referred to [D-21] for a discussion of this method in the context of the
Nahi-Assefi scalar-scan recursive technique. There are, of course, many
other nonlinear 1-D recursive estimation techniques besides the EKF, and
most of these can be applied in this framework. For an example of one other
such technique (again applied to the Nahi-Assefi method), we refer the reader
to [D-75].
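
To make the phrase "continual relinearization about the present best
estimate" concrete, the following is a minimal sketch of a scalar EKF
measurement update for y = g(x) + v. The square-law nonlinearity, the noise
variance, and all names are our own illustrative assumptions; this is not the
image model of (D.64) nor the scheme of [D-21].

```python
def ekf_measurement_update(x_hat, p, y, g, dg, r):
    """One scalar EKF measurement update for y = g(x) + v, var(v) = r.

    x_hat, p : prior estimate and its error variance
    g, dg    : measurement nonlinearity and its derivative (assumed known)
    """
    h = dg(x_hat)                       # relinearize about the current estimate
    gain = p * h / (h * p * h + r)      # Kalman gain for the linearized model
    x_new = x_hat + gain * (y - g(x_hat))
    p_new = (1.0 - gain * h) * p
    return x_new, p_new

# Illustrative nonlinearity (an assumption): a square-law sensor.
g = lambda x: x * x
dg = lambda x: 2.0 * x

x_hat, p = 1.5, 1.0
for _ in range(5):                      # repeated looks at the same quantity
    x_hat, p = ekf_measurement_update(x_hat, p, y=4.0, g=g, dg=dg, r=0.01)
```

When g is linear the update reduces to the ordinary Kalman filter measurement
update, in line with the remark above about the linear case.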

Another issue that we have mentioned on several occasions is the
incorporation of constraints, such as the positivity of the image estimate,
into the estimation procedure. As mentioned earlier (see footnote 21), in many
cases (footnote 25) we needn't worry about this constraint explicitly.
However, it is worth understanding the implications of such constraints.
Andrews and Hunt [D-81] consider the constrained least squares formulation
together with the additional positivity constraint. In this case there is no
closed-form solution, and iterative nonlinear programming methods must be
used. Mascarenhas and Pratt [D-23] also consider the incorporation of upper
bounds on pixel intensities in order to improve the conditioning of the
restoration problem, and similar types of bounds on the pixels and on the
values of the PSF (assumed unknown in this case) were considered by MacAdam
[D-39]. In the case of recursive techniques, one can also include positivity
constraints. In [D-239] Jain discusses a recursive, iterative method for
incorporating this constraint into the Nahi-Assefi model. Thus, we see that
constraints such as these can be incorporated into the methods discussed
previously. The cost is a great increase in computational complexity, and it
is not clear that it is worth the trouble.

25

And for homomorphic techniques we have no reason to worry at all, since
exponentiation at the end guarantees positivity.

A third problem area with many of the restoration techniques is in the 
reliance on a priori information. As mentioned earlier, one often can assume
knowledge of the PSF or can determine it by observing known test scenes
through the imaging system. In other cases, we may not have such information
and must estimate the PSF as well as the image. Based on the assumption that
the extent of the PSF is far less than that of the image, Stockham et al.
[E-4] suggest a "blind homomorphic deconvolution" procedure, in which one
breaks the received image into pieces, takes 2-D transforms and the logarithm
of the transforms, and then averages over the various pieces. This, combined
with the specification of a prototype transform (corresponding to the average
of the logarithm of the transform of the original image), allows one to
estimate the PSF and the other parameters needed for the geometric mean filter
described earlier. We refer the reader to [E-4] for details.

The question of parameter uncertainty is clearly of major importance for 
the various recursive techniques, all of which require a great deal of a 
priori information. Thus one important question concerns the robustness of
these techniques in the face of modelling errors. As mentioned in Section C,
techniques do exist for the sensitivity analysis of 1-D state-space models
and 1-D Kalman filters (see [A-65,C-23]). Can we extend these methods to the
2-D case, and how well do the 2-D algorithms perform? Is there any way to
make them more robust? In addition, methods abound in 1-D for on-line
parameter identification and adaptive estimation in the presence of unknown
parameters (see the various techniques described in Section B). Can we apply
these methods with any success to the 2-D problem? The successes of such
methods in 1-D and the several appealing features of 2-D recursive estimation
techniques make these worthwhile questions for future research.

A final area of concern is the resolution-noise suppression tradeoff.
As mentioned earlier, the human visual system is willing to accept more noise
in certain regions, such as edges, in order to improve resolution. Thus, in
relatively slowly varying regions of the image we would like to remove noise,
while where there are abrupt scene changes or other high frequency
fluctuations of interest, we would prefer to forego noise suppression in favor
of resolution. Backus and Gilbert [D-78] (see also [D-19]) have devised a
nonrecursive technique for taking this tradeoff into account. They define a
quantitative measure of the blur induced in the image by filtering. Then for
any given value of this measure, one can determine the restoration scheme that
minimizes the effects of noise subject to this constraint. We refer the reader
to [D-19,78] for details (see also [D-79]). Anderson and Netravali [D-99] have
developed another nonrecursive approach involving a performance index that
provides a tradeoff between blur introduced by the filter and the level of
noise suppression. Their criterion utilizes the results of certain
psychovisual experiments that were designed to measure the relative importance
of a unit of noise in high and low contrast conditions, but the evidence is
still inconclusive as to whether or not a standard measure can be obtained for
a large class of images. In addition to these methods, we refer the reader to
[D-38,44,225] for discussions of several other nonrecursive image enhancement
techniques.

In the context of simultaneous image enhancement and noise suppression, 
an important problem involves the detection of edges or boundaries between 
different regions in an image. Within each of these regions one may be able 
to utilize one of the restoration techniques developed earlier, and in this 
manner we can suppress noise while preserving the resolution of the boundaries. 
We also note that in many applications the determination of the boundaries 
themselves may be the key issue [D-175]. In recent years a variety of
techniques have been developed for detecting and recognizing various types of
boundaries in 2-D data. Many of these methods are based on pattern recognition
techniques [D-243], and we will not discuss them here. We simply refer the
reader to several references on this subject [D-15,210,240].

In 1-D, a variety of recursive techniques have been developed for estimation
and detection of abrupt changes in signals (see [B-103] for a survey of
many of these). These techniques have been successfully applied in a wide
variety of applications, including automatic detection of cardiac arrhythmias
[B-104] and the detection of sensor and actuator failures [B-103]. An important
question then is the extension of methods such as these to the detection of
boundaries in images. To a large extent this remains an open problem, but
there has been some work along these lines. Specifically, Nahi and Habibi
[D-25] considered the problem of the detection of an object superimposed on
a background scene. Their approach involved the modification of the methods
of Nahi [D-18,21,58] and of Habibi [D-22] to incorporate a binary variable
that indicates whether a particular pixel is in the object or in the
background. The scheme devised in [D-25] involves the recursive calculation of
likelihood ratios for the existence of boundaries, and it also incorporates
the use of a bank of two filters (based on object and background statistics,
respectively) for the suppression of noise once the boundaries have been
determined. In [D-175] Nahi and Lopez-Mora were primarily concerned with the
estimation of the boundary. Here, the 1-D Markov scan model of [D-18,21,58]
is augmented to include several states used to model the boundary. As the
resulting model is nonlinear, a nonlinear estimation scheme is employed, and
some promising results are presented in [D-175]. These results notwithstanding,
a great deal of work remains to be done in the development of recursive methods
for the detection of boundaries in images. It is our feeling that this may
prove to be one of the most important uses of 2-D recursive estimation
techniques.
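
As a simple 1-D illustration of the recursive calculation of likelihood
ratios for an abrupt change, the sketch below uses the standard CUSUM test for
a jump in the mean of a Gaussian sequence. We have substituted this textbook
technique for illustration; it is not the scheme of [D-25] or of the methods
surveyed in [B-103], and the means, variance, threshold, and scan-line data
are assumed values.

```python
def cusum_detect(samples, mu0, mu1, sigma2, threshold):
    """Return the first index at which the CUSUM statistic exceeds threshold.

    Tests for a jump in mean from mu0 to mu1 in white Gaussian noise of
    variance sigma2; returns None if no alarm is raised.
    """
    g = 0.0
    for k, y in enumerate(samples):
        # log-likelihood ratio increment of N(mu1, sigma2) against N(mu0, sigma2)
        llr = (mu1 - mu0) / sigma2 * (y - 0.5 * (mu0 + mu1))
        g = max(0.0, g + llr)          # recursive update, reset at zero
        if g > threshold:
            return k
    return None

# Hypothetical scan line: background level 0, object level 1, edge at index 4.
scan_line = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
alarm = cusum_detect(scan_line, mu0=0.0, mu1=1.0, sigma2=0.25, threshold=5.0)
```

The alarm is raised a few samples after the edge; the threshold trades
detection delay against false alarms.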

We now turn our attention to the detailed analysis of statistical and
probabilistic models for random fields. Applications for such techniques
extend far beyond image processing into fields such as seismic signal
processing [D-68,70,199,209,216,227,245], gravity mapping [D-1,211,212,224],
meteorology and atmospheric modelling [D-69,214,231], biomedical imagery and
image reconstruction [D-11,29,213,223], modelling of scattering fields
[D-222,232-235], modelling of the distribution of earth resources
[D-15,61,246], analysis and modelling of turbulence [D-217], and the modelling
and analysis of random transport and wave propagation phenomena
[D-170,189,193,194,215,220,226].

With such a wide variety of potential applications, there clearly is a need 
for a general methodology for the analysis of random fields. Much has been 
done in this direction, but, as with all multidimensional topics, much remains 
to be done. We will describe some of the work that has been done, will touch 
on several of the applications mentioned above, and will speculate on some 
open questions. 

Motivated to a great extent by their utility in 1-D, many researchers
have investigated the extension of the concept of a Markov process to several
dimensions. Perhaps the first of these was developed by Levy for continuous
parameter spaces [D-12,197,198,230]. The situation in two dimensions is
depicted in Figure D.14. Suppose we have a 2-D random field f(x,y). Then f
is called Markov of degree p if it essentially (footnote 26) has the following
property: let $\partial G$ be any smooth closed curve encircling the origin and
separating the plane into the "past" ($G^-$), the "present" ($\partial G$), and
the "future"; then, given f and its first p-1 derivatives at the present, the
values of f in the future are independent of the values of f in the past. The
field f is called Markov if it is Markov of degree 1. This definition is quite
intuitive, and one can imagine fields in a variety of physical situations that
have this type of radial causality.

26

We say "essentially" here since f may not be differentiable. For the
technically precise definition, we refer the reader to the references.

Levy also defined a multidimensional Brownian motion process x(t),
$t = (t_1,\ldots,t_d)$, which is a Gaussian process with statistics

$$E[x(a)] = 0 \tag{D.144}$$

$$E[x(a)x(b)] = \tfrac{1}{2}\left(|a| + |b| - |b-a|\right) \tag{D.145}$$

(here $|\cdot|$ is the usual Euclidean distance). McKean [D-198] showed that for
d odd, x(t) is Markovian of degree (d+1)/2, and for d even, x(t) has no
Markovian property. Since Brownian motion and its Markovian properties
proved to be so useful in developing 1-D tools of stochastic analysis, the
above result is disappointing. This disappointment is in fact compounded by
the analysis of Wong [D-12], who showed that there are essentially no
continuous Gaussian random fields in two or more dimensions that are
simultaneously stationary, isotropic (the covariance function is invariant if
we rotate the coordinates of the parameter space), and Markov (of degree one).
Thus it is evident that this setting will not lead to a useful
multidimensional stochastic calculus for the study of random fields. To obtain
such a calculus, we must turn to a recursive formulation, and we shall do this
shortly.
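
The covariance of Levy's Brownian motion, E[x(a)x(b)] = (|a| + |b| - |b-a|)/2,
is easy to evaluate numerically. The short sketch below (function names are
ours) checks two familiar consequences: the variance at a equals |a|, and in
one dimension with a, b >= 0 the covariance reduces to min(a, b), the ordinary
Brownian motion covariance.

```python
import math

def norm(v):
    """Euclidean length of a point given as a sequence of coordinates."""
    return math.sqrt(sum(c * c for c in v))

def levy_cov(a, b):
    """E[x(a)x(b)] = (|a| + |b| - |b - a|) / 2 for Levy's Brownian motion."""
    diff = [bi - ai for ai, bi in zip(a, b)]
    return 0.5 * (norm(a) + norm(b) - norm(diff))

cov_1d = levy_cov([2.0], [5.0])              # equals min(2, 5) = 2
var_2d = levy_cov([3.0, 4.0], [3.0, 4.0])    # equals |a| = 5
cov_opposite = levy_cov([-2.0], [5.0])       # 0: opposite sides of the origin
```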

It is interesting to note that the analog of Levy's notion for discrete
space systems, as developed by Woods in [D-9], leads to far more useful
results. Stationary Gaussian fields of this type can be generated by
interpolative filters (footnote 27) of the form

$$x(n,m) = \sum_{(k,\ell)\in D_p} h(k,\ell)\,x(n-k,m-\ell) + u(n,m) \tag{D.146}$$

where u(n,m) is stationary, and

$$D_p = \{(k,\ell) \mid k^2+\ell^2 \le p^2,\ k,\ell \text{ not both } 0\} \tag{D.147}$$

[The equations through (D.149), which precede the following remark on the
correlation of u(n,m), are not legible in the source.]

Thus, we see that the driving noise in this case is not white but is
finitely correlated.

27

Such models have been considered by several authors including Whittle [D-61]
and Larimore and Beavers [D-1].

We have already seen in the work of Jain and Angel [D-32] that
interpolative models can be used for efficient recursive estimation of random
fields. Such models also have several other uses. One possibility is in
the area of spectral estimation. In [D-237] Woods proposes the fitting of
observed correlation data to an interpolative Markov model. In this case
one again obtains a set of normal equations for the coefficients of the
model that yields the minimal interpolation error in a least squares sense.
Unfortunately, as Woods points out, these equations cannot be inverted
efficiently as in the 1-D linear prediction case, and Woods [D-237] proposes
a complex algorithm for obtaining the desired spectral estimate.

Fortunately, as we have seen in [D-32], nearest neighbor models have a
great deal of structure that can be exploited to obtain efficient
computational schemes. In [D-31] Jain proposes a nearest neighbor
interpolative filter for image coding. Basically, Jain assumes a separable,
stationary, isotropic model for the image

$$E[x(n,m)\,x(0,0)] = \text{[right-hand side not legible in the source]} \tag{D.150}$$

and in this case he finds the optimum first order (p=1 in (D.147))
interpolative error filter. In this simple case, one can solve the normal
equations by inspection. Having this filter, one can consider a coding scheme
in which we transmit only the interpolation error. Thus, the decoder must
essentially solve the interpolative, and hence nonrecursive, equation. In
general, this is a difficult task in its own right. However, one can use
techniques analogous to those in [D-32] to perform the reconstruction
efficiently. That is, we can consider reconstructing the image line by line,
and the resulting vector equations display the same type of tridiagonal
structure that was exploited earlier in the development of an efficient
restoration scheme. Similarly in this case we can also use FFT algorithms for
efficient reconstruction. In addition, as discussed in [D-241,242], the use
of interpolative models leads to efficient Karhunen-Loeve transform coding
using the FFT.
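
To see why the decoder's task is nonrecursive yet still tractable, consider a
1-D analog of the nearest neighbor model, x(n) = a[x(n-1) + x(n+1)] + u(n).
Given the transmitted errors u(n), the decoder must solve a tridiagonal
system, which the Thomas algorithm does in a number of operations proportional
to the line length. The sketch below is our own illustration: the 1-D setting,
the zero boundary values, the coefficient a, and all function names are
assumptions, and the line-by-line block structure and FFT methods of [D-32]
are not shown.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; lower[0], upper[-1] unused."""
    n = len(rhs)
    c = [0.0] * n                       # modified upper diagonal
    d = [0.0] * n                       # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):               # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):      # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def interpolation_error(x, a):
    """Encoder: u(n) = x(n) - a*(x(n-1) + x(n+1)), with zeros outside the line."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append(x[i] - a * (left + right))
    return out

def reconstruct(u, a):
    """Decoder: solve -a*x(n-1) + x(n) - a*x(n+1) = u(n) for the whole line."""
    n = len(u)
    return solve_tridiagonal([-a] * n, [1.0] * n, [-a] * n, u)

a = 0.4                                 # |a| < 0.5 keeps the system well posed
line = [1.0, 3.0, 2.0, 5.0]
errors = interpolation_error(line, a)
decoded = reconstruct(errors, a)        # recovers `line` up to rounding
```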

Thus, we have seen that interpolative models have a number of appealing
properties. They also have their drawbacks, such as in efficient spectral
estimation, and one is naturally led to seek other models and statistical
methods for fitting 2-D data to parametric forms for such models. One
immediate generalization from 1-D that we have mentioned before is the
development of 2-D linear prediction techniques, i.e., the identification of
2-D, causal, autoregressive models by means of least squares predictive error
filter design. Immediately we see that one problem that arises is the choice
of the direction of recursion for the AR model, i.e., which elements of the
field will be used to predict which other elements. Another problem is the
stability of the resulting filter, which is guaranteed in 1-D but not in 2-D,
as Genin and Kamp have pointed out [D-145] (see also the work of Marzetta
[D-66,67]). In addition, even if stability is not a problem, one faces the
question of finding efficient algorithms for the solution of the normal
equations that specify the filter parameters, i.e., is there a fast 2-D
Levinson algorithm? In [D-145] Genin and Kamp develop sets of recurrence
relations for 2-D orthogonal polynomials. Can these relations be used to
devise fast algorithms as they can in 1-D (see Section B)?
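
For reference, the 1-D fast algorithm whose 2-D counterpart is being sought
can be sketched as follows. This is the standard Levinson-Durbin recursion for
the normal equations of scalar linear prediction, written under our own naming
conventions; it assumes r[0] > 0 and a nonsingular Toeplitz correlation
matrix, and it is not a 2-D algorithm.

```python
def levinson_durbin(r, order):
    """Solve the 1-D linear prediction normal equations in O(order^2) steps.

    r     : autocorrelation values r[0], ..., r[order]
    return: (a, err), where a[k-1] multiplies x(n-k) in the predictor and
            err is the final prediction error variance.
    """
    a = []
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - 1 - k] for k in range(m - 1))
        k_m = acc / err                 # reflection coefficient of stage m
        a = [a[k] - k_m * a[m - 2 - k] for k in range(m - 1)] + [k_m]
        err *= (1.0 - k_m * k_m)
    return a, err

# Illustrative check: a first order process with correlation 0.5**|k|.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For a valid autocorrelation sequence each reflection coefficient has magnitude
less than one, which is the 1-D stability guarantee referred to above.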

The above questions remain open in general, but recently Marzetta [D-67]
developed a fast algorithm for 2-D linear prediction that involves the use of
1-D techniques and the same 2-D scalar/1-D vector interplay that we have seen
before. Consider the situation depicted in Figure D.15a. We have a stationary
2-D field x(k,ℓ), and we wish to predict x(m,n) based on the array of x(i,j)
to the SW that are indicated in the figure. We do this in two steps. First,
regarding each column as a 1-D vector, we use the preceding columns to predict
the mth column. This can be done with standard fast algorithms for vector 1-D
linear prediction. In fact, in this case we can effectively use the faster
scalar 1-D algorithm since the block Toeplitz matrix to be inverted is in
fact Toeplitz because of vertical stationarity. Having completed this step,
we compute the prediction errors in the last column and use these to predict
the error at (m,n) by performing a scalar 1-D prediction to the North. This
algorithm bears a striking resemblance in style to that of Attasi (see
Figure D.12). The only difference is that Attasi uses two 1-D Kalman filters,
one North and one South, to perform a smoothing along the last column. This
observation leads us to speculate on the existence of a fast algorithm for
linear interpolation for the semicausal [D-30] structure illustrated in Figure
D.15b.

28

See also [D-62] for another difficulty that arises with such discrete-time,
nonrecursive, 2-D Markov models.

29

A related question, given the perspective of Section B, is the existence of
fast algorithms for the calculation of the gains of recursive 2-D Kalman
filters.

[Figure D.15: Known and Conjectured Fast Algorithms for 2-D Linear Prediction
and Interpolation. Panel (b): "Does a Fast Algorithm Like This Exist?"]

Identification of parametric 2-D models has attracted the attention of
several statisticians over the years [D-1,59,61], and several of their results
are definitely worth noting. Whittle [D-61,63] was one of the first researchers
to consider the properties of 2-D stationary processes. One of the topics
he considered was the "unilateral" representation of a 2-D process, which is
simply a half-plane recursive representation of a given field. Using a method
exactly along the lines developed by Dudgeon [D-33,102], Ekstrom and Woods
[D-119], and Marzetta [D-66], Whittle obtained a representation of this type,
in general of infinite order, by factoring the 2-D power spectral density of
the process. In addition, in [D-61] Whittle also relates various recursive
and nonrecursive autoregressive discrete-space models to analogous stochastic
partial differential equations. Such equations were examined by Heine [D-60],
who examined the properties of linear stochastic equations of the parabolic,
elliptic, and hyperbolic forms. Whittle noted that the nearest neighbor model
corresponds to an elliptic equation for which Heine showed that the correlation
function takes the form of a modified Bessel function of the second kind.
Whittle then uses this fact to argue that in the discrete space case, such
correlation function forms are preferable to decaying exponentials.

In addition to considering these issues, Whittle also discussed the
maximum likelihood and least squares estimation of the parameters of a 2-D
autoregressive model. This subject is also considered in far greater detail
by Larimore and Beavers [D-1], and the results bring to light a rather
important point. In 1-D, assuming Gaussian statistics, finding the maximum
likelihood parameter estimates is equivalent to finding the parameters of an
inverse filter that yields the least squares prediction error, i.e., the
log-likelihood ratio is, up to an additive constant, proportional to the
negative of the sum of squared estimation errors. In the 2-D problem, this is
not the case if the field model is not causal. This is due to the fact that in
this case the Jacobian of the transformation from prediction errors to the
field is not unity and is, in general, a rather complicated function of the
parameters. This greatly complicates parameter and spectral estimation, as
we already noted in discussing the work of Woods [D-237]. We refer the reader
to [D-1] for details of the problem of 2-D parametric model identification and
for the consideration of other problems, such as the design of a 1-D shaping
filter for the part of a 2-D field observed by a point tracing a path in
the plane. This problem is of great practical value in applications such as
accurate inertial navigation and gravity field estimation (footnote 30)
[D-1,193,194].
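
The role of the Jacobian in the maximum likelihood discussion above can be
made explicit with a short worked equation. As a simplifying assumption of
ours (not necessarily the exact setting of [D-1] or [D-61]), stack the N field
values into a vector x and suppose the model can be written as
$A(\theta)x = e$ with $e \sim N(0,\sigma^2 I)$, where $\theta$ denotes the
model parameters. Then

```latex
\begin{align*}
p(x;\theta) &= |\det A(\theta)|\,(2\pi\sigma^2)^{-N/2}
               \exp\!\left(-\frac{\|A(\theta)x\|^2}{2\sigma^2}\right), \\
\log p(x;\theta) &= \log|\det A(\theta)| - \frac{N}{2}\log(2\pi\sigma^2)
               - \frac{1}{2\sigma^2}\sum_{i=1}^{N} e_i^2(\theta).
\end{align*}
```

For a causal model with a suitable ordering of the points, $A(\theta)$ is
lower triangular with unit diagonal, so $\det A(\theta) = 1$ and maximizing
the likelihood is the same as minimizing the sum of squared errors. For a
noncausal model $A(\theta)$ is not triangular, the term
$\log|\det A(\theta)|$ depends on $\theta$, and the two criteria differ.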

As discussed earlier, in 1-D the use of stochastic calculus greatly
facilitates the analysis of continuous-time random processes. It seems
natural, then, to attempt to extend concepts such as the Markov property,
Brownian motion, and stochastic calculus to 2-D. We have seen, however,
that the intuitively appealing approach of Levy does not provide a useful
framework, and the reason for this is the lack of causality in this framework.
Specifically, in 1-D the basic tools of analysis of Brownian motion, Poisson
processes, stochastic differential equations, etc., essentially are based on
the principles of martingale theory (footnote 31) (see, for example, [D-247]).
Simply put, a martingale M(t) is a 1-D random process such that for t>s the
best estimate of M(t) given $M(\tau)$, $\tau \le s$, is M(s):

$$E[M(t) \mid M(\tau),\ \tau \le s] = M(s) \tag{D.151}$$
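
The defining property of a martingale can be checked exactly for a simple
discrete-time example. The sketch below uses a fair random walk with steps of
+1 or -1 (our own illustrative choice) and computes the conditional mean of
M(t) given the first s steps by enumerating all equally likely continuations.

```python
from fractions import Fraction
from itertools import product

def conditional_mean(past_steps, t):
    """E[M(t) | first s steps] for a fair +/-1 random walk, by enumeration."""
    s = len(past_steps)
    m_s = sum(past_steps)                         # M(s), known from the past
    futures = list(product((-1, 1), repeat=t - s))
    total = sum(Fraction(m_s + sum(f)) for f in futures)
    return total / len(futures)

past = (1, 1, -1)                                 # observed steps, so M(3) = 1
estimate = conditional_mean(past, t=6)            # equals M(3)
```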

To extend these notions to 2-D, we immediately run into a problem: what
does t>s mean? That is, we must be able to specify at least a partial order
on the plane, and, as discussed in Subsection D.1, this can be done if we
impose a causal structure on the processes considered.

30

The problem of modelling random perturbations in gravitational fields has been
considered by a number of authors [D-211,212,224]. A common approach to this
problem is the use of the spatial transform most appropriate for such
problems: spherical harmonics. Wong [D-12,230] has considered such transform
methods in the general setting of isotropic random fields on spaces with
constant curvature. The use of geometric concepts such as spherical harmonics
greatly facilitates the analysis of random fields. We also refer the reader to
the work of Swerling [D-10], in which many of the statistical properties of
random contours are discussed at some length.

31

The following discussion is greatly oversimplified, and we refer the reader
to the references for the full story.

In recent years 2-D martingales (and higher dimensional generalizations)
with a NE causal structure have been investigated by a number of authors
[D-172,176,179-187,191,192,196]. Basically, we consider processes
defined on the NE quadrant, on which we place the partial order

$$(z_1,z_2) \succ (s_1,s_2) \iff z_i \ge s_i,\quad i=1,2 \tag{D.152}$$

Then $M(z_1,z_2)$ is a (NE) martingale if whenever $(z_1,z_2) \succ (s_1,s_2)$,

$$E\big(M(z_1,z_2) \mid \mathcal{M}(s_1,s_2)\big) = M(s_1,s_2) \tag{D.153}$$

The relevant geometry is depicted in Figure D.16. Here $\mathcal{M}(r_1,r_2)$
denotes the set of all $M(s_1,s_2)$ with $(r_1,r_2) \succ (s_1,s_2)$.
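
The essential difference from the 1-D case is that the NE order is only
partial: some pairs of points cannot be compared at all. A minimal sketch,
with function names of our own choosing and the order taken to be
componentwise comparison of coordinates:

```python
def dominates(z, s):
    """True if z follows s in the NE partial order: z_i >= s_i for i = 1, 2."""
    return all(zi >= si for zi, si in zip(z, s))

def comparable(a, b):
    """True if the two points are ordered one way or the other."""
    return dominates(a, b) or dominates(b, a)

ordered = comparable((2, 3), (1, 1))      # (2,3) lies to the NE of (1,1)
unordered = comparable((1, 2), (2, 1))    # neither point is NE of the other
```

Pairs such as (1,2) and (2,1) are the "unordered points" that have no
counterpart on the line.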

Having this framework one can then begin to develop all of the tools for
a usable 2-D stochastic calculus. The results obtained indicate that such
a calculus can be developed, but it is not without its surprises and
limitations (footnote 32). One of the major surprises is that given a NE
martingale, the lack of a total order leads directly to the construction of a
second martingale, and, in fact, this second martingale, which in some sense
involves products of the original martingale at unordered points, is essential
to the development of a full set of stochastic differentiation rules. In
addition, one of the major limitations of this approach appears to be the
restriction to quadrant causality. But is this really a restriction? In 1-D
one of the most important dynamic models involves the representation of a
random process as the output of a causal stochastic differential equation
driven by a martingale. Perhaps in 2-D we must break the process into two
parts, one driven by a NE martingale and one by a SE martingale. Recalling
Ekstrom and Woods' assertion [D-119] that any power spectral density can be
created by using white noise to drive one half-plane or two quadrant filters,
this idea may not be that far-fetched.

32

The same comment can, of course, be made with regard to just about any topic
in 2-D system analysis.

In any event, there certainly appear to be enough reasons to pursue the
utility of such a continuous parameter 2-D stochastic calculus. In 1-D one
often finds that the continuous-time solution is far simpler computationally
and conceptually than the corresponding discrete-time solution and, in fact,
for digital systems one often solves the continuous problem and discretizes
rather than discretizing the problem at the start. Examining the
recursive 2-D optimal estimation and detection results derived by Wong in the
continuous case [D-172,176,179,187] and comparing them to the analogous
discrete-time results discussed earlier in this section, we see that the same
may be true here. In addition to applications such as these, it appears that
a 2-D stochastic calculus may be of use in the analysis of processes that
evolve in both space and time, which is the next topic of discussion. It is
our feeling that the preceding remarks and the following development provide
ample motivation for the continued study of 2-D stochastic calculus.

33

In [D-184] it is argued that this second martingale arises naturally from the
deterministic rules involving Stieltjes differentials on the plane.

Throughout this subsection we have seen numerous examples of 2-D signal 
processing problems in which good use is made of the transformation of the
signals obtained by considering them to be 1-D vector time signals, in which
the other independent spatial variable is used to index components of the
vectors. We now will briefly examine several problems in which the processes
are truly of this form, i.e., they are space-time processes, or at least
in which one can benefit by viewing multivariable 1-D systems as systems
with two independent variables.

One of the best examples of space-time processes arises in the
consideration of seismic signal processing (see
[D-68,70,199,209,215,216,227,245]), in which we observe the response of the
earth to excitation through an array of sensors. In such a system the sensors
receive signals due to reflections from different layers in the earth. In
addition, there is often coherent noise, resulting from various types of
waves, and there also is incoherent noise. Hence, we obtain a 2-D signal
y(j,t), where t is time and j denotes the jth sensor (here j can be thought of
as a measure of distance from the sensor to the location of the original
excitation). If S(t) denotes the response of the earth to the excitation, we
can model y(j,t) as follows [D-68]:

$$y(j,t) = S(t-\tau_j) + N(t-\delta_j) + w(j,t) \tag{D.154}$$

where $\tau_j$ and $\delta_j$ are the time delays incurred by the earth
response and the coherent noise, respectively, in travelling to the jth
sensor. Also, w(j,t) is the incoherent noise. Given this 2-D signal, we want
to estimate S(t) and the time delays $\tau_j$ (called "moveouts").

A number of solutions have been developed for this problem. In the
context of 2-D signal processing, if we assume constant, but different, speeds
of propagation for S and N, i.e.,

$$\tau_j = \frac{d_j}{v_S}, \qquad \delta_j = \frac{d_j}{v_N} \tag{D.155}$$

where $d_j$ is the distance to the jth sensor (and where we write $v_S$ and
$v_N$ for the two propagation speeds), we can use "fan" filters to
discriminate between these signals. Basically, if we consider the 2-D
Fourier transform of these space-time signals (let us assume for simplicity
that we have a continuum of sensors)

$$Y(\omega_1,\omega_2) = \iint y(x,t)\, e^{-j\omega_1 x - j\omega_2 t}\, dx\, dt \tag{D.156}$$

then the point $(\omega_1,\omega_2)$ corresponds to a plane wave traveling with
velocity $\omega_2/\omega_1$ = slope of the line connecting this point to the
origin. Hence all of the velocities within a given range are obtained by
points in a sector in $(\omega_1,\omega_2)$-space, and thus if we design a
filter to pass only the frequencies in the appropriate sector, we can achieve
the desired velocity discrimination (see Figure D.17).
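
The sector idea can be sketched as an ideal pass/stop mask in the 2-D
frequency plane. In the illustration below a frequency point (w1, w2) is
passed when the slope of the line joining it to the origin, taken in magnitude
as the apparent speed, lies in a prescribed band. The use of the magnitude,
the treatment of the w1 = 0 axis, and all names are our own assumptions; a
practical fan filter would also taper the sector edges.

```python
def fan_filter_pass(w1, w2, v_min, v_max):
    """True if (w1, w2) lies in the sector of apparent speeds [v_min, v_max]."""
    if w1 == 0.0:
        return False                    # infinite slope: assumed outside the band
    speed = abs(w2 / w1)                # slope of the line through the origin
    return v_min <= speed <= v_max

def fan_filter_mask(w1_grid, w2_grid, v_min, v_max):
    """Ideal 0/1 frequency response sampled on a rectangular grid."""
    return [[1.0 if fan_filter_pass(w1, w2, v_min, v_max) else 0.0
             for w1 in w1_grid]
            for w2 in w2_grid]

grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
mask = fan_filter_mask(grid, grid, v_min=1.5, v_max=3.0)
```

Multiplying the 2-D transform of the array data by such a mask passes
components in the chosen velocity band and rejects the rest.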

In addition to this type of approach, one can consider the design of
optimal filters for the estimation of S and the $\tau_j$. In [D-68] Sengbush
and Foster derive the optimal nonrecursive Wiener filter for this problem and
analyze its properties as a 2-D filtering system. We refer the reader to
[D-68] for the details of this development and for a discussion of other 2-D
nonrecursive techniques.

An interesting question involves the development of recursive estimation 
techniques for problems such as these. Such algorithms may be particularly 
useful given the apparent need for using space-varying models [D-68] . We 
will discuss the problem of recursive techniques very shortly. 

Another class of space-time problems is essentially 3-D. This involves
the observation of a sequence of 2-D images in order to determine motion or
scene changes. Such problems arise in meteorological applications such as the
tracking of cloud motion [D-69,231]. In addition, if one is performing
image processing on a sequence of images, one might expect that the use of
temporal as well as spatial correlations would improve overall processor
performance. The development of systematic recursive or nonrecursive
approaches to problems such as these is an appealing area for future work.

A final area in which one finds space-time processes is in the
consideration of random vector (transport) or force fields which affect the
motion of particles or waves. Applications for models such as these abound.
How does the statistical description of a random gravitational field affect
the motion of a satellite [D-224], and by observing the motion of the
satellite, how can we obtain better estimates of the gravitational field?
Given a statistical description of wind currents, how can we predict the
space-time distribution of pollutants coming from some source and determine
the optimal locations for the placement of pollution sensors?

Techniques exist for all of the problems mentioned above, but at this
time there is no systematic theory for the probabilistic analysis and recursive
estimation of general space-time stochastic processes, although major steps
have been taken in this direction for space-time point processes [D-11,223],
and some work has been done towards developing a calculus for isotropic
random vector fields [D-226]. In addition, motivated by several of these
applications, Kam and Willsky [D-193-195] and Washburn [D-196] have attempted
to utilize the tools of 1-D and 2-D stochastic calculus in order to develop
recursive techniques for space-time processes. We briefly describe several
of these results.

The results in [D-193-195] are basically separable in nature, in that
1-D stochastic models are developed separately for the spatial and temporal
variations. Motivated by time delay problems such as those that arise in
seismic signal processing, we have considered the following problem: a
source at spatial location s=0 transmits a random signal $\phi(t)$, $t \ge 0$. This
signal is modeled as the output of a possibly time-varying linear shaping
filter

    $\dot{x}(t) = A(t)\,x(t) + w(t)$    (D.157)

    $\phi(t) = C'(t)\,x(t)$    (D.158)

The signal is then propagated in the positive s direction by a random
velocity field v(s) with given statistics. At points $s_1, \ldots, s_N$ we have
sensors which measure delayed versions of the signal

    $y(i,t) = \phi(t - \tau_i) + V(i,t)$    (D.159)

    $\tau_i = \int_0^{s_i} \frac{ds}{v(s)}$    (D.160)


Given this problem formulation we consider the problem of recursive optimal
estimation of $\phi$ and of the delays.^34 This is an extremely difficult problem,
and implementable solutions have been found only in certain special cases.
However, the work in [D-193,194] represents a useful first step in the
development of such techniques, and the results obtained can be used to devise
suboptimal recursive schemes. Work continues along these lines.

Another problem considered in [D-193, 195] has certain aspects in common 
with problems considered in [D-1,24] . Specifically, we have a random field 
and a point sensor that traces a 1-D track along the field. As considered 
in [D-1] , suppose we can model the spatial variations along this 1-D track 
by a spatial shaping filter 


    $\dot{x}(s) = A\,x(s) + w(s)$    (D.161)

f(s) = Cx(s) (D.162) 


Let v(t) and s(t) denote the velocity and position of the point sensor as a 
function of time. The time history of the observations of the point sensor
may then be modeled by 

y(t) = f(s(t)) + v(t) (D.163) 


^34 We also allow the possibility of delayed versions of $\phi$ being transmitted from
other locations. This can be used to model multiple reflections.

or, if we include the possibility of blurring,

    $y(t) = \int h(t-\tau)\, f(s(\tau))\, d\tau + v(t)$    (D.164)

Although only the case of (D.163) was considered in [D-193,195], the
analysis can be readily extended to the case of (D.164). This extension
is presently being developed.

Given this formulation, one can ask several questions. For example,
one might wish to estimate the field f given these measurements. If the
velocity history is known, this is not difficult, and this problem resembles
that of [D-24], at least in spirit. If the velocity is unknown (i.e.,
we have random motion blur), the problem is more complex. Methods are
developed in [D-193,195] for the suboptimal solution of this problem.
Note that in this case we have one more difficulty: the mapping problem.
At any point in time we don't know which point s(t) we're looking at. Note
also that intuitively in all of these problems the velocity v(t) must affect
the accuracy of our observations: the faster we move, the less we observe.
Thus, one can consider the problem of controlling the speed of the sensor
in order to achieve certain performance specifications. An optimal control
problem along these lines is considered in [D-193,195].

A third class of separable space-time problems, motivated by the random 
force field problem, is presently being studied. We have a 1-D random 
acceleration field a(s) which has a spatial shaping filter representation

    $\dot{x}(s) = A\,x(s) + w(s)$    (D.165)

    $a(s) = C\,x(s)$    (D.166)

Suppose a particle is subject to this field. The equations governing its
motion are

    $\dot{s}(t) = v(t)$    (D.167)

    $\dot{v}(t) = a(s(t))$    (D.168)

We wish to estimate the shape of the random field from noisy observations
of the position of the particle

    $y(t) = s(t) + V(t)$    (D.169)

Results for problems of this type will be forthcoming. 

Clearly all of these problems represent vast simplifications of real
problems, but they also represent a start. One must now consider the
extension of these ideas to several spatial dimensions and the use of
nonseparable space-time stochastic models. The use of a multidimensional
stochastic calculus such as that described earlier is clearly essential. As
an indication that at least in some cases the NE causal structure of this
calculus may not be a problem and in fact may be natural, we mention an
observation of Washburn [D-196]. Suppose we consider a space-time system
with one spatial dimension, and suppose that because of fundamental
limitations (due, for example, to the finite speed of light) events at any given
spatial point can affect those at another only with a certain time delay.
This leads to the usual "light cone" description of the future and past of
a given space-time point. Assuming we scale the axes appropriately, this
cone can be assumed to have an angle of 90°, as indicated in Figure D.18.
Hence, by rotating the coordinates by 45°, we obtain a NE causal structure.
The utility of this observation when combined with 2-D stochastic calculus
and a variety of space-time analysis problems will be reported in the future
[D-196].
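
The light-cone observation can be checked numerically. The following is only a minimal sketch, assuming the axes have been scaled so that the propagation speed is one; the function names and the test grid are illustrative and do not come from [D-196]. It verifies that an event lies in the future cone of a reference point exactly when it lies to the "north-east" of that point in coordinates rotated by 45 degrees.

```python
import math


def rotate_45(t, s):
    """Rotate the (t, s) axes by 45 degrees so the new axes lie along the cone edges."""
    c = math.sqrt(0.5)
    return c * (t - s), c * (t + s)


def in_future_cone(event, origin):
    """True if `origin` can influence `event` when effects travel at unit speed."""
    dt = event[0] - origin[0]
    ds = event[1] - origin[1]
    return dt >= abs(ds)


def is_northeast(event, origin):
    """True if `event` is north-east of `origin` in the rotated coordinates."""
    a0, b0 = rotate_45(*origin)
    a1, b1 = rotate_45(*event)
    return a1 >= a0 and b1 >= b0


# Hypothetical integer grid of (time, space) points and a reference point.
grid = [(t, s) for t in range(-3, 4) for s in range(-3, 4)]
origin = (0, 1)
agree = all(in_future_cone(p, origin) == is_northeast(p, origin) for p in grid)
print(agree)
```

The two tests coincide because the cone condition dt >= |ds| is the same as requiring both dt - ds >= 0 and dt + ds >= 0, which are the two rotated coordinates.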

As mentioned earlier, in addition to systems which truly have a space-time
character, one can view any multivariable 1-D system as a 2-D system
by considering the "space" variable to be the index of the elements of the
various vector functions of time. While this may not be particularly
natural in general, this philosophy appears to have some merit for large
scale systems which consist of a number of interconnected subsystems. In
this case we let the spatial variable index subsystem variables which may be 
vector quantities themselves. A general linear model for such a system is 

    $x(k+1,i) = \sum_j A_{ij}\, x(k,j) + \sum_j B_{ij}\, u(k,j) + w(k,i)$    (D.170)

    $y(k,i) = \sum_j C_{ij}\, x(k,j) + v(k,i)$    (D.171)

Clearly this is a recursive 2-D model. Examples of large-scale systems of
this type abound in practice, including power systems, communication
networks, and freeway traffic systems.^35 We refer the reader to [D-84-89,
91-94, 190, 201, 203-208] for other examples and for some insight into the
problems associated with such systems.

Figure D.18: The 2-D Causality Structure for Space-Time Processes

The problems with these systems are of two types: (1) the analysis of
these systems using tools such as the Lyapunov equation and the determination
of optimal filter and controller designs is far too complex to be carried 
out using standard methods because of the high dimensionality of the overall 
system; and (2) the implementation of standard controllers and estimators 
is out of the question, since these systems require totally centralized 
processing of all subsystem data in order to determine each subsystem control; 
what is needed is a decentralized scheme. 

We have seen similar questions in our study of recursive image processing
techniques. The full-state optimal Kalman filter of Woods and
Radewan [D-173,229,236] was of enormous dimension, and one would never dream
of attempting to solve the Riccati equation in this case. In addition, the
Kalman filter update is far too complex, and, as we discussed, Woods
and Radewan suggested a "nearest neighbor" constrained Kalman filter, in which
only those pixels near the one presently being processed are themselves updated.
This clearly is a decentralization of sorts, as are the techniques proposed
by Murphy and Silverman [D-174] and Pratt [D-17]. What these methods have in
common is the following: we specify some constraints on information transfer

(i.e., we limit the extent of the update portion of the filter) and then
we optimize the filter gains subject to these constraints. This same philosophy
is precisely what is used in many decentralized control and estimation
problems [D-86,201,205]. That is, we specify some constraints on the information
pattern (i.e., which data are available for each subsystem) and then we
optimize the estimator and controller gains subject to these constraints.

^35 In this last case, the subsystem index does represent a spatial variable, as
each subsystem describes the aggregate behavior of traffic on a link of a freeway
(see [D-208]). In this case, the choice of the size of each link is a type of
sampling problem, and the issues of spatial sampling, such as those raised by
Mascarenhas and Pratt [D-23] and Hunt [D-4] in the context of image processing,
are clearly relevant here.

Thus, we've seen that large-scale systems can be viewed as 2-D systems
and that constrained optimization for efficient or decentralized processing
is common in both settings. Is there any other insight that can be gained or new
results that can be obtained by examination of large-scale systems as 2-D
systems? The answer is perhaps, and we will relate some preliminary
observations that make us feel that the answer will ultimately be yes.

First of all, suppose that the model (D.170), (D.171) falls into the class
considered by Attasi [D-6,35,96]. Then the optimal centralized Kalman filter
is nothing more than Attasi's line-by-line optimal processor. In this context,
let us re-examine the structure of this processor as pictured in Figure D.12.
One may argue that this processor may not be a good image restoration system,
but it certainly is an extremely efficient centralized Kalman filter! The
predict cycles for each subsystem are carried out in a totally decoupled fashion,
and in the update stage, each subsystem need only communicate with its nearest
neighbors (we have two streams of information flowing, corresponding to the two
Kalman filters). Whether optimal centralized controllers also have this
structure remains an open question.

The problem of choosing a good information pattern in the first place is an 
extremely important and complex one, but it is beyond the scope of our present 
discussion. We refer the reader to the references and in particular to [D-85].

As a second example, consider the case in which (D.170), (D.171) are
spatially invariant:^37

    $x(k+1,i) = \sum_j A_{i-j}\, x(k,j) + \sum_j B_{i-j}\, u(k,j) + w(k,i)$    (D.172)

    $y(k,i) = \sum_j C_{i-j}\, x(k,j) + v(k,i)$    (D.173)


In the case of an infinite string of subsystems, no noise, and a spatially
invariant quadratic cost function, Melzer and Kuo [D-203] determined an
efficient method for determining the optimal centralized controller, and Chu
[D-205] used the same method to determine the optimal, constrained
decentralized controller. The basic idea is identical to that used by Hunt [D-46],
Andrews and Hunt [D-81], Jain and Angel [D-32], and Attasi [D-6,35,96]: we
take the z-transforms of (D.172), (D.173) in the subsystem variable to
obtain a system of decoupled optimal control problems (parametrized by z) of
dimension equal to that of each subsystem state.

To make these ideas more clear, let us consider the case in which we have
a finite string of subsystems [D-190], i=0,...,N-1. In this case, if we
rewrite (D.172), (D.173) in terms of one giant state, input, and output vector,
we find that the resulting A, B, and C matrices are block Toeplitz. As
Andrews and Hunt [D-81] discuss, we then make the block circulant approximation^38
to obtain


    $x(k+1,i) = \sum_{j=0}^{N-1} A_j\, x(k,i-j) + \sum_{j=0}^{N-1} B_j\, u(k,i-j) + w(k,i)$    (D.174)

    $y(k,i) = \sum_{j=0}^{N-1} C_j\, x(k,i-j) + v(k,i)$    (D.175)

with

    $E[w(k,i)\,w'(\ell,j)] = S_{i-j}\,\delta_{k\ell}$    (D.176)

    $E[v(k,i)\,v'(\ell,j)] = \Theta_{i-j}\,\delta_{k\ell}$    (D.177)

where all subsystem indices are to be interpreted modulo N. Suppose we wish
to design a controller to minimize the criterion

    $J = E\left\{ \sum_{k=0}^{\infty} \sum_{i,j=0}^{N-1} \left[ x'(k,i)\,Q_{i-j}\,x(k,j) + u'(k,i)\,R_{i-j}\,u(k,j) \right] \right\}$    (D.178)

^37 Such models arise, for example, in the longitudinal control of a string of
vehicles [D-204] such as one finds in personal rapid transit systems.

^38 Approximations such as these often arise in the discretization of partial
differential equations such as the wave equation (see, for example, [E-31]).


As discussed in Appendix 2, we take subsystem transforms. For example,

    $x(k,\ell) = \sum_{i=0}^{N-1} x(k,i)\, w^{i\ell}$    (D.179)


We then obtain a set of decoupled problems, indexed by $\ell$:

    $x(k+1,\ell) = A(\ell)\,x(k,\ell) + B(\ell)\,u(k,\ell) + w(k,\ell)$    (D.180)

    $y(k,\ell) = C(\ell)\,x(k,\ell) + v(k,\ell)$    (D.181)

    $E[w(k,\ell)\,w^*(j,m)] = S(\ell)\,\delta_{kj}\,\delta_{\ell m}$    (D.182)

    $E[v(k,\ell)\,v^*(j,m)] = \Theta(\ell)\,\delta_{kj}\,\delta_{\ell m}$    (D.183)

    $J_\ell = E\left\{ \sum_{k=0}^{\infty} \left[ x^*(k,\ell)\,Q(\ell)\,x(k,\ell) + u^*(k,\ell)\,R(\ell)\,u(k,\ell) \right] \right\}$    (D.184)

Here * = complex conjugate. Note also that since all original variables are
real, we have $A(-\ell) = A^*(\ell)$, etc. Thus, we need only solve approximately
one-half of these problems to obtain the optimal centralized controller, which is
efficiently implemented in Figure D.19. The reader is urged to compare this
figure with Jain and Angel's optimal image restoration scheme as depicted in
Figure D.13. The similarity here is rather striking, as is the similarity
in method and philosophy underlying both systems. Work involving the system
(D.174), (D.175) is continuing. We are examining such issues as the effects
of the block circulant approximation, the use of this method for fast algorithms
for Lyapunov equations, Riccati equations, pole placement, etc., and the design
of decentralized controllers. Note that one possible decentralization can be
obtained by spatially windowing the optimal centralized filter and control
gains. As the properties of various windows are well-known (see, for example,
[C-1]), it may be possible to obtain detailed performance evaluations for
such schemes.
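
The decoupling produced by the subsystem transform can be illustrated with a small numerical sketch. It assumes scalar subsystems, no inputs or noise, and a transform kernel exp(-2*pi*j*i*l/N); the kernel sign convention and the coefficient values are assumptions made for illustration, not taken from Appendix 2. The sketch checks that transforming a circulant update gives N independent scalar updates, and that real coefficients give the conjugate symmetry that lets roughly half of the problems be skipped.

```python
import cmath


def dft(seq):
    """Transform over the subsystem index with kernel exp(-2*pi*j*i*l/N) (assumed convention)."""
    n = len(seq)
    return [sum(seq[i] * cmath.exp(-2j * cmath.pi * i * l / n) for i in range(n))
            for l in range(n)]


def circulant_step(a, x):
    """One coupled update x(k+1, i) = sum_j a[j] * x(k, (i - j) mod N) for scalar subsystems."""
    n = len(x)
    return [sum(a[j] * x[(i - j) % n] for j in range(n)) for i in range(n)]


a = [0.5, 0.3, 0.0, 0.0, 0.0, 0.1]   # hypothetical coupling coefficients A_j
x = [1.0, 0.0, -2.0, 0.5, 0.0, 3.0]  # hypothetical subsystem states at time k
n = len(x)

A, X = dft(a), dft(x)
coupled = dft(circulant_step(a, x))          # transform of the coupled update
decoupled = [A[l] * X[l] for l in range(n)]  # N independent scalar updates

decoupling_error = max(abs(p - q) for p, q in zip(coupled, decoupled))
symmetry_error = max(abs(A[(-l) % n] - A[l].conjugate()) for l in range(n))
print(decoupling_error, symmetry_error)
```

Both printed errors should be at rounding level. For vector subsystems the same idea applies block by block, with each A(l) a matrix rather than a scalar.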

Thus, we have seen that there are points of contact between 2-D processing
concepts and large-scale 1-D system analysis. Whether these points will lead
to major new results or exciting concepts remains to be seen, but there certainly
appear to be some intriguing possibilities.




Figure D.19: Illustrating the Optimal Circulant Feedback Systems

E. Some Issues in Nonlinear System Analysis: Homomorphic Filtering,
Bilinear Systems, and Algebraic System Theory

Most of the discussion to this point has dealt with the analysis and syn- 
thesis of linear systems, perhaps distorted by nonlinear effects such as 
quantization. However, there has been much work on the analysis and design 
of systems which are fundamentally nonlinear in both digital signal processing 
and in control and estimation theory. It is beyond the scope of this paper 
to consider the research in this area at any depth, and we refer the reader to 
the references and to the literature in the two disciplines for the full story. 

In this section we limit ourselves to a brief look at two particular directions
of research that have a common thread involving the use of algebraic concepts
to study nonlinear systems possessing particular types of structure. The
philosophy underlying these results is that many of the concepts and techniques
from linear system theory can be carried over to the analysis of certain non- 
linear systems. Not only is this of use in allowing one to solve certain non- 
linear problems, but it is also of value in providing insight into the properties 
of linear systems — i.e. one gets a clearer picture of which system properties 
carry over to nonlinear systems with particular structure and which properties 
are fundamentally tied to linearity. 

In digital signal processing, Oppenheim [C-1,E-1,2] abstracted the key 
concept in linear system analysis — superposition — and developed what he 
termed homomorphic signal processing . Following [C-1] , the basic idea is as 
follows. Let X and Y be spaces with two operations
defined on each: a binary operation

    $x_1, x_2 \in X \;\Rightarrow\; x_1 \ast x_2 \in X$
    $y_1, y_2 \in Y \;\Rightarrow\; y_1 \circ y_2 \in Y$    (E.1)

and an operation of scalar action

    $c \in \mathbb{R} \text{ or } \mathbb{C},\; x \in X \;\Rightarrow\; c \cdot x \in X$
    $c \in \mathbb{R} \text{ or } \mathbb{C},\; y \in Y \;\Rightarrow\; c \odot y \in Y$    (E.2)

A homomorphism is then a map H from X to Y which preserves these operations,
i.e., it satisfies a "generalized superposition principle"

    $H(x_1 \ast x_2) = H(x_1) \circ H(x_2)$
    $H(c \cdot x_1) = c \odot H(x_1)$    (E.3)

If the operations (E.1), (E.2) satisfy the axioms of a vector space (e.g. this
means that all the operations are commutative), then (E.3) looks very much like
a linear system. In fact, one can show in this case [E-1] that any such system
can be represented as the cascade of three homomorphic systems

    $H = D_y^{-1}\, L\, D_x$    (E.4)

where L is a standard linear system, and $D_x$ and $D_y$ are called characteristic
systems. They translate the operations in X and Y into usual vector addition
and scalar multiplication.

Let us take a look at an example of this. Let X be the space of input 
sequences in which each input is strictly positive. We make X into a vector 
space with the operations 


    $[(\alpha \cdot x_1) \ast (\beta \cdot x_2)](n) = x_1(n)^{\alpha}\, x_2(n)^{\beta}$    (E.5)

and the characteristic system is clearly seen to be the map

    $x(n) \mapsto \log[x(n)]$    (E.6)

with the inverse

    $\xi(n) \mapsto e^{\xi(n)}$    (E.7)

One can similarly define vector space operations in which X consists of 
all nonzero complex numbers or all of those of modulus one [C-1,E-1], although
there are some difficulties due to the nonuniqueness of the complex logarithm.
We will not go into these here and refer the reader to [E-1,C-1].

Having this framework, one can consider the filtering of signals corrupted 
by multiplicative effects. That is, suppose we observe 

z(n) = x(n)u(n) (E.8) 

(all quantities assumed to be >0), and we wish to recover x from z. If we take
the logarithm of both sides

    $\xi(n) = \log z(n) = \log x(n) + \log u(n)$    (E.9)

we can use linear techniques to filter $\xi$, yielding the output $\eta(n)$, and
we then obtain the desired filtered version as

    $\hat{x}(n) = e^{\eta(n)}$    (E.10)
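
The logarithm, linear filter, exponential cascade can be sketched in a few lines. This is only an illustration under stated assumptions: the linear stage is a simple causal moving average chosen for convenience (it is not the filter used in the cited references), and the sample sequences and exponents are hypothetical. The check at the end confirms the generalized superposition property for the multiplicative operations of (E.5).

```python
import math


def moving_average(seq, width):
    """Causal moving average, used here as a stand-in for the linear filter L."""
    out = []
    for n in range(len(seq)):
        window = seq[max(0, n - width + 1): n + 1]
        out.append(sum(window) / len(window))
    return out


def homomorphic_filter(z, width=3):
    """Cascade log -> linear filter -> exp for strictly positive sequences."""
    xi = [math.log(v) for v in z]       # characteristic system: logarithm
    eta = moving_average(xi, width)     # ordinary linear filtering
    return [math.exp(v) for v in eta]   # inverse characteristic system


# Hypothetical positive input sequences and scalars.
x1 = [1.0, 2.0, 4.0, 3.0, 1.5]
x2 = [0.5, 0.7, 2.0, 1.0, 3.0]
alpha, beta = 2.0, 0.5

combined = [p ** alpha * q ** beta for p, q in zip(x1, x2)]
lhs = homomorphic_filter(combined)
rhs = [hp ** alpha * hq ** beta
       for hp, hq in zip(homomorphic_filter(x1), homomorphic_filter(x2))]
superposition_error = max(abs(l - r) / r for l, r in zip(lhs, rhs))
print(superposition_error)
```

Any other linear filter could replace the moving average without changing the superposition property, since the property depends only on the linearity of the middle stage.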


For applications of multiplicative homomorphic processing, we refer the
reader to [C-1,E-2]. In the case in which log(x) and log(u) are Gaussian random
variables, i.e., when x and u are lognormal variables [E-9,12-16,20], the
filtering of $\xi(n)$ is simply a Kalman filter (this result is developed thoroughly
in [E-20]). The continuous-time version of this multiplicative noise model has
been studied in [E-2], and its stochastic analog was developed in [E-9]. Let
us examine this case at some length. Let w(t) be a two-dimensional Gauss-Markov
process satisfying the equation

    $\dot{w}(t) = A\,w(t) + v(t)$    (E.11)

where v(t) is a two-dimensional white noise process

    $E(v(t)) = 0, \qquad E(v(t)\,v'(\tau)) = Q\,\delta(t-\tau)$    (E.12)

Suppose we transmit the "frequency modulated signal"^1

    $x(t) = \exp\left( \int_0^t [w_1(s) + j\,w_2(s)]\, ds \right)$    (E.13)

Due to some effect (e.g. atmospheric turbulence [E-21]), the received signal
is corrupted by multiplicative noise

^1 Here we are allowing both the usual type of modulation on the phase and a
"homomorphic" modulation on the amplitude.






    $r(t) = u(t)\,x(t)$    (E.14)

where

    $u(t) = \exp(\eta_1(t) + j\,\eta_2(t))$    (E.15)

and $\eta$ is a two-dimensional Brownian motion process

    $E(\eta(t)) = 0, \qquad E(\eta(t)\,\eta'(\tau)) = R\,\delta(t-\tau)$    (E.16)



Because of the continuity of r(t), there is no difficulty in taking the 
complex logarithm (see [E-S] ; essentially, continuous monitoring of phase 
allows one to unravel it and determine the number of revolutions, as well as 
the value of the phase modulo $2\pi$). In this case r(t) is equivalent to the
observations

    $dz_1(t) = w_1(t)\,dt + d\eta_1(t)$
    $dz_2(t) = w_2(t)\,dt + d\eta_2(t)$    (E.17)


Using standard Kalman filtering techniques, we can obtain the least squares
estimates $\hat{w}_1(t)$ and $\hat{w}_2(t)$. However, the best estimate of x is not

    $\exp\left( \int_0^t [\hat{w}_1(s) + j\,\hat{w}_2(s)]\, ds \right)$

essentially because the integral of a best estimate is not the best estimate
of the integral. However, in this case we can obtain the desired estimate as
follows. Let

    $p_1(t) = \int_0^t w_1(s)\, ds, \qquad p_2(t) = \int_0^t w_2(s)\, ds$    (E.18)

Then by adjoining these integrals to $w_1$ and $w_2$ to form a four-dimensional
"state", we can again design a Kalman filter (with measurements (E.17)) and
obtain the best estimates $\hat{w}_1(t)$, $\hat{w}_2(t)$, $\hat{p}_1(t)$, and $\hat{p}_2(t)$. Then the desired
estimate is

    $\hat{x}(t) = \exp(\hat{p}_1(t) + j\,\hat{p}_2(t))$    (E.19)

The details of this development are given in [E-9,12]. Also, in these
references it is shown that the solution of the discrete-time version (i.e.,
when we observe only $r(k\Delta)$, where r is as in (E.14)) is much more difficult,
essentially due to the ambiguity in the complex logarithm which cannot be
resolved in this case.
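
The difference between continuous monitoring and sampled observation of the phase can be seen in a toy experiment. The sketch below is not the estimator of [E-9,12]; it assumes a noise-free, hypothetical phase ramp and a naive unwrapping rule that picks the smallest admissible phase step between samples. With closely spaced samples the accumulated phase is recovered; with widely spaced samples whole revolutions are lost.

```python
import cmath
import math


def unwrap(phases):
    """Accumulate phase, assuming successive samples differ by less than pi in magnitude."""
    out = [phases[0]]
    for p in phases[1:]:
        step = p - out[-1]
        step -= 2.0 * math.pi * round(step / (2.0 * math.pi))  # nearest equivalent step
        out.append(out[-1] + step)
    return out


def tracked_phase(phase_fn, dt, n_steps):
    """Sample exp(j*phase) every dt and return (final time, unwrapped final phase)."""
    times = [k * dt for k in range(n_steps + 1)]
    observed = [cmath.phase(cmath.exp(1j * phase_fn(t))) for t in times]
    return times[-1], unwrap(observed)[-1]


def true_phase(t):
    return 5.0 * t   # hypothetical phase ramp: 5 radians per unit time


t_fine, fine = tracked_phase(true_phase, dt=0.1, n_steps=40)
t_coarse, coarse = tracked_phase(true_phase, dt=1.0, n_steps=4)
fine_error = abs(fine - true_phase(t_fine))
coarse_error = abs(coarse - true_phase(t_coarse))
print(fine_error, coarse_error)
```

In the coarse case each sample-to-sample phase change exceeds pi, so the unwrapping rule cannot tell how many revolutions occurred, which is the ambiguity referred to above.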

In digital signal processing, multiplicative homomorphic systems represent 
only one half of the picture. As discussed in [C-1,E-1,2] one can study systems 
in which vector addition is the operation of convolution and multiplication by 
an integer n corresponds to convolution of a signal with itself n times 
(multiplication by a non-integer is a generalization of this [E-1,2,22]). The 
key to the development of homomorphic filtering techniques for convolutional 
noise is the z-transform of signals. Let X be a vector space of signals under
the operations of convolution as vector addition and scalar multiplication as
defined above. Then we have the following transform relations

    $(x_1 \ast x_2)(n) \;\leftrightarrow\; X_1(z)\,X_2(z)$

    $(\alpha \cdot x_1)(n) \;\leftrightarrow\; [X_1(z)]^{\alpha}$

and we see that homomorphic convolution systems look like multiplicative
homomorphic systems in the frequency domain. This allows one to develop a rather
complete theory of convolution-homomorphic filtering, and we refer the reader
to [C-1,E-2,22] for details. Techniques such as "homomorphic deconvolution"
have found application in speech analysis [C-1,E-3], dereverberation of signals
such as those arising in seismic applications [C-1,E-2,22,23], and in several
other disciplines (see [C-1,E-2]).
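
The way convolution turns into addition after a transform and a logarithm can be checked on short sequences. The sketch below makes several simplifying assumptions: it uses the finite DFT with circular convolution as a stand-in for the z-transform, it looks only at log magnitudes (so the phase-unwrapping issues of the complex logarithm are avoided), and the two sequences are hypothetical and chosen so that their transforms have no zeros.

```python
import cmath
import math


def dft(seq):
    """Finite DFT, used here as a stand-in for the z-transform on the unit circle."""
    n = len(seq)
    return [sum(seq[i] * cmath.exp(-2j * cmath.pi * i * l / n) for i in range(n))
            for l in range(n)]


def circular_convolve(x, h):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[j] * h[(i - j) % n] for j in range(n)) for i in range(n)]


def log_magnitude(seq):
    """Logarithm of the transform magnitude at each frequency."""
    return [math.log(abs(v)) for v in dft(seq)]


# Hypothetical sequences whose transforms do not vanish on the unit circle.
x1 = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
x2 = [1.0, -0.3, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0]

conv_log = log_magnitude(circular_convolve(x1, x2))
sum_log = [p + q for p, q in zip(log_magnitude(x1), log_magnitude(x2))]
additivity_error = max(abs(p - q) for p, q in zip(conv_log, sum_log))

# Convolving a signal with itself corresponds to multiplying its log spectrum by 2.
self_log = log_magnitude(circular_convolve(x1, x1))
scaling_error = max(abs(p - 2.0 * q) for p, q in zip(self_log, log_magnitude(x1)))
print(additivity_error, scaling_error)
```

Once the components add, one of them can in principle be removed by ordinary linear filtering of the log spectrum, which is the idea behind homomorphic deconvolution.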

A recent direction of research in control and estimation theory has been
the study of bilinear systems [C-26,E-5-16], and the multiplicative homomorphic
system (E.11)-(E.16) represents one of the simplest examples. Consider (E.13).
We can easily obtain a stochastic differential equation for x:

    $\dot{x}(t) = (w_1(t) + j\,w_2(t))\,x(t)$    (E.21)

If we regard $w_1$, $w_2$ as inputs (controls and/or noises), we see that the
right-hand side of (E.21) consists of a product of inputs and the state, i.e.,
it is a bilinear function of the two. Generalizing this, we obtain the class of
bilinear systems

    $\dot{x}(t) = \left[ A_0 + \sum_{i=1}^{N} A_i\, u_i(t) \right] x(t)$    (E.22)

where the $A_i$ are known n×n, possibly complex-valued matrices, the $u_i$ are scalar
inputs, and x is either an n-vector or an n×n matrix.

The question of the control, estimation, and stability of bilinear systems
such as (E.22) has received a great deal of attention in the recent past and
has applications in a wide range of disciplines (see [C-26,E-5,12,14-16,24]).
We will not examine the control or stability issues here and refer the reader
to the references. Rather, we content ourselves with a brief look at the
estimation problem in order to uncover some of the main issues in "bilinear
signal processing". Note that in the scalar case (E.21), one can readily
obtain a representation for x(t) of the form (E.13). However, in the vector
case this is not true in general. In fact, the solution of (E.22) has the
representation

    $x(t) = \exp\left[ A_0 t + \sum_{i=1}^{N} A_i \int_0^t u_i(s)\, ds \right] x(0)$    (E.23)

if and only if all of the matrices $A_0, A_1, \ldots, A_N$ commute (a very restrictive
condition). In fact, the commutativity or noncommutativity properties of
these matrices play a central role in the analysis of bilinear systems, and
the introduction of concepts from the theory of Lie algebras and Lie groups
allows one to study these systems in great detail [E-5,6,8-17,25,26].
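
The role of commutativity in the exponential representation can be illustrated numerically. The sketch below is an illustration only: it takes a single input that is 0 on [0, 1) and 1 on [1, 2], so that the true transition matrix over [0, 2] is exp(A0 + A1) exp(A0), and compares it with the exponential-of-integrals formula exp(2 A0 + A1). The matrices are hypothetical, and the matrix exponential is a truncated Taylor series that is adequate only for small matrices of modest norm.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matadd(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * v for v in row] for row in a]


def expm(a, terms=40):
    """Matrix exponential by truncated Taylor series (small matrices, modest norm)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = matadd(result, term)
    return result


def formula_gap(a0, a1):
    """Largest entry-wise gap between the true transition matrix and the
    exponential-of-integrals formula, for u = 0 on [0, 1) and u = 1 on [1, 2]."""
    true_phi = matmul(expm(matadd(a0, a1)), expm(a0))
    formula = expm(matadd(scale(2.0, a0), a1))   # A0 * 2 + A1 * (integral of u)
    return max(abs(p - q) for rp, rq in zip(true_phi, formula) for p, q in zip(rp, rq))


# Hypothetical commuting pair: a1 is a polynomial in a0.
a0 = [[0.0, 1.0], [-1.0, 0.0]]
a1 = [[0.5, 0.3], [-0.3, 0.5]]
# Hypothetical noncommuting pair.
b0 = [[0.0, 1.0], [0.0, 0.0]]
b1 = [[0.0, 0.0], [1.0, 0.0]]

commuting_gap = formula_gap(a0, a1)
noncommuting_gap = formula_gap(b0, b1)
print(commuting_gap, noncommuting_gap)
```

For the commuting pair the gap is at rounding level, while for the noncommuting pair the formula is visibly wrong, so no single exponential of integrated inputs reproduces the solution.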

Let us see what this noncommutativity can do by examining a problem that
is motivated by (E.11)-(E.16). If we examine those equations and consider
only the phase effects (i.e., $w_2$ and $\eta_2$), we see that this problem is the
estimation of a phase given noisy measurements of that phase. By performing a
transformation on the measurement, we obtain a noisy measurement of the angular
frequency. We can then apply standard Kalman filtering techniques to estimate
the angular frequency and its integral, and then the desired phase estimate is
just the complex exponential of the estimate of the integral. A natural
extension of this problem is the consideration of rotation in three dimensions.

We follow [E-5, 12, 14-16] . Suppose we have a satellite, equipped with an 
inertial platform. The orientation of the satellite with respect to an inertial 
frame can be specified by coordinatizing a body-fixed orthonormal basis in 
inertial coordinates. The resulting set of three 3-vectors is called the 
direction cosine matrix X(t) and it has the property 


X'(t)X(t)=I, det X(t)=l 


(E.24) 


Let w(t) be the angular velocity of the body with respect to inertial space, 
coordinatized in the body frame. Then, it is known that the evolution of the 
direction cosine matrix is described by the bilinear equation 


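    $\dot{X}(t) = \left[ \sum_{i=1}^{3} R_i\, w_i(t) \right] X(t)$    (E.25)

where

    $R_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}, \quad R_2 = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \quad R_3 = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$    (E.26)
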
Suppose that our only observation of satellite attitude is from the inertial
platform, i.e., we observe the direction cosine matrix M(t) of the body with
respect to the platform, which is supposed to remain fixed in inertial space
(in which case M=X). However, because of various errors (e.g. gyro drift), the
platform drifts, and our actual observation is

    $M(t) = X(t)\,V(t)$    (E.27)



where the "platform misalignment term" V(t) is the direction cosine matrix
of inertial space with respect to the platform. As described in [E-14], this
can be modeled by a bilinear equation of the form^2

    $\dot{V}(t) = V(t) \left[ \sum_{i=1}^{3} R_i\, v_i(t) \right]$    (E.28)

where the $v_i$ represent gyro drift and for simplicity are taken to be white.
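
A small numerical sketch may help fix ideas about the attitude model. It assumes the standard skew-symmetric generators for the matrices R_i and piecewise-constant angular velocities, for which each propagation step is a matrix exponential; the drift values are hypothetical and constant rather than white. The sketch checks that the propagated direction cosine matrix keeps the properties (E.24), that the order of two rotations matters, and that the product form of the observation (E.27) is again a rotation.

```python
import math

# Skew-symmetric generators (assumed to match the matrices R_1, R_2, R_3).
R = [
    [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, -1.0, 0.0]],
    [[0.0, 0.0, -1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
]
IDENTITY = [[float(i == j) for j in range(3)] for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def expm(a, terms=40):
    """Matrix exponential by truncated Taylor series (fine for these small rotations)."""
    result = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(result, term)]
    return result


def propagate(x, w, duration):
    """X(t + T) = exp(T * sum_i R_i w_i) X(t) for angular velocity w held constant."""
    m = [[duration * sum(w[i] * R[i][r][c] for i in range(3)) for c in range(3)]
         for r in range(3)]
    return matmul(expm(m), x)


def orthogonality_error(x):
    xtx = [[sum(x[k][i] * x[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    return max(abs(xtx[i][j] - IDENTITY[i][j]) for i in range(3) for j in range(3))


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


quarter = math.pi / 2.0
x_12 = propagate(propagate(IDENTITY, (1.0, 0.0, 0.0), quarter), (0.0, 1.0, 0.0), quarter)
x_21 = propagate(propagate(IDENTITY, (0.0, 1.0, 0.0), quarter), (1.0, 0.0, 0.0), quarter)
order_gap = max(abs(x_12[i][j] - x_21[i][j]) for i in range(3) for j in range(3))

v_drift = propagate(IDENTITY, (0.01, -0.02, 0.005), 1.0)  # hypothetical small misalignment
m_obs = matmul(x_12, v_drift)                             # observation of the form M = X V
print(orthogonality_error(x_12), det3(x_12), order_gap, orthogonality_error(m_obs))
```

The nonzero order gap is the three-dimensional noncommutativity that has no counterpart in the one-dimensional phase problem.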

The reader should now compare (E.25)-(E.28) with (E.13)-(E.15) (using
(E.21) and an analogous equation for u). We see that we have a direct analog
of the phase (i.e., one-dimensional rotation) problem, including a
multiplicative noise model (E.27) (see [E-9] to see that if only one $w_i$ and the
corresponding $v_i$ are nonzero, then this problem precisely reduces to the phase
estimation problem). Suppose we now assume that w obeys an equation such as
(E.11). Then, essentially by the matrix equivalent of the complex logarithm
(again we have no "mod $2\pi$" difficulties because of our continuous observation),


we can essentially differentiate M(t) to obtain noisy measurements of w
corrupted by the gyro drifts v. The problem here is somewhat more complex
than the earlier one in that one must take care in using stochastic calculus
(see footnote 2) and, more importantly, because rotations in three dimensions
do not commute (see [E-16,27]). However, as derived in [E-16], one can carry
the analysis through to obtain a measurement equation of the form

    $z(t) = w(t) + M(t)\,v(t)$    (E.29)

where $v(t) = (v_1(t), v_2(t), v_3(t))'$. Note that the effect of the gyro drifts on our
measurement of angular velocity depends upon our attitude (this effect can be removed
in the one-dimensional rotation case).

Using (E.29), we can design a Kalman filter to estimate w. However, we
run into a problem in estimating X. Recall that in the one-dimensional
problem, we augmented the state of our Kalman filter with the estimate of the
integral of w, but in the three-dimensional case the integrals of components
of w are not simply related to X, again because of the noncommutativity of
rotations in three dimensions. In fact, in this case the problem of optimal
estimation of X is infinite-dimensional [E-14]. Thus, in the one-dimensional
case we obtain a decomposition much like (E.4). We can convert our multiplicative
process into a linear one and can operate on it with optimal linear techniques.
However, the re-injection of the resulting filtered process becomes extremely
complex. One must use approximate methods (see [E-12,14]) except in special cases.
The case when all of the $A_i$ commute is much like the scalar case and involves
looking at the integrals of certain quantities [E-9,12]. In addition, if the $A_i$
obey certain (somewhat less restrictive) noncommutativity relations one can
obtain a finite-dimensional optimal procedure by considering several types of
iterated integrals (see [E-14,15,17] for details).

^2 Technically, one must include a "correction term" in (E.28) if one interprets
it as an Ito stochastic equation. This is not difficult but it does obscure our
point with technicalities (which certainly are very important). The reader is
referred to [E-16,27] for the details. Note that (E.28) can be interpreted
rigorously if one uses Stratonovich calculus [E-8,14].

Let us say a few more words about the relationship between homomorphic
filtering (HF) and bilinear signal processing (BSP). Recall that HF is based
on the existence of certain algebraic properties between input functions and
output functions, i.e., the validity of a superposition rule. Also, in HF,
one designs a filter consisting of three parts: a "projection" system, which
"unravels" the signals so that one can use a linear filter as the second part,
followed by an "injection" of the resulting process to yield the desired output.
In the scalar example of BSP, as described in (E.11)-(E.19), we obtain a
system of exactly this form, i.e., a HF (logarithm, linear (Kalman) filter,
exponential), and as we mentioned earlier, we obtain essentially the same
results for the model (E.22), (E.27) if the $A_i$ commute. However, in the general
case we cannot obtain the entire picture. Specifically, we can "unravel" the
signal and can perform linear (and perhaps nonlinear [E-14,15,17]) processing,
but the re-injection process is much more difficult. Perhaps one of the keys
to the difference between HF and BSP is the difference in the starting point
of the two theories. In homomorphic filtering the fundamental assumption involves
the algebraic structure of the relation between input trajectories and output
trajectories (superposition). For bilinear systems analysis, the starting point
is (E.22), which can be seen to impose an algebraic (multiplicative) restriction
on the time rate of change of the state or output, i.e., in some sense, (E.22)
represents an "incrementally homomorphic" model, in which the fundamental
assumption involves algebraically compatible dynamics (as opposed to an
input-output relation). In the case when the $A_i$ commute, (E.22) also yields a
multiplicative I/O relationship, and in the other special cases considered in
[E-14,15,17], the restrictions on the $A_i$ yield other tractable I/O relations,
but in these cases the optimal filters are not homomorphic (since following the
unraveling of the received signal, we perform a nonlinear filtering operation).
We should note, however, that in the general case, the algebraic structure of
(E.22) still allows one to perform a great deal of analysis, and we refer the
reader to the references for details (see, for example, [E-5,10]).

We note that the use of algebraic and geometric concepts and techniques 
to study systems with algebraically compatible dynamics or input-output relations 
has increased greatly over the past few years as new theories and applications 
have been uncovered [E-5 through 19, 24 through 36]. Recently, certain nonlinear
systems having Volterra series representations have been studied with great
success [E-6,10,11,14,17,33,34] using techniques and ideas that have grown out
of the study of bilinear systems [E-5,7]. In addition, motivated by many of the
same issues that motivated Oppenheim's study [E-1] of generalized superposition,
several researchers [E-18,19,28-34] have examined systems whose state dynamics
possess some, but not all, of the algebraic structure of linear systems. Also,
several researchers [E-35,36] have studied controllability, realizability and
related properties for systems which possess particularly nice input/output
descriptions, much along the lines of Oppenheim's generalized superposition.

By performing such analyses, new insights have been gained into the properties of
linear systems, and many of the powerful tools of linear system analysis are
being extended to other dynamical systems, establishing the foundations for a
synthesis and analysis theory for special classes of nonlinear systems. It is
this key idea, the use of algebraic tools to synthesize and analyze nonlinear
systems with structure, that is the major common theme of the nonlinear systems
research in the two disciplines.
Concluding Remarks

In this report we have examined a number of broad research areas
that have attracted workers in two disciplines: digital signal
processing and control and estimation theory. Our goal has been to explore
these areas in order to gain perspective on relationships among the
questions asked, methods used, and general philosophies adopted by
researchers in these disciplines. Upon undertaking this study it was
our feeling that such perspective would be extremely valuable in
promoting collaboration and interaction among researchers in the two
fields. Upon concluding this study, we think that our initial feelings
have been thoroughly substantiated. Not only are there numerous 
examples of questions in one discipline that can benefit from the point 
of view of the other, but also we have found a number of new issues 
that naturally arose from combining the two points of view. 

Each of the disciplines has its own distinct character, and 
clearly these will and should be maintained. On the other hand, each 
discipline can gain from understanding the other. State space methods 
have their limitations, such as in specifying useful digital algorithms 
and structures. On the other hand, state space methods provide ex- 
tremely powerful computer-aided algorithms for noise analysis, optimal 
design specification, etc. State space ideas also allow one to con-
sider multivariable and time-varying systems. All of these aspects of
state space theory may prove of value to people involved in digital
signal processing. On the other side, researchers in digital filtering
have answered many crucial questions related to turning design specifi-
cations into implementable designs. The deep understanding that workers
in digital signal processing have concerning the problems of digital
implementation is something that researchers in control and estimation
would do well to gain. Thus it seems clear that a mutual understanding 
will prove beneficial to all concerned. 

We have raised numerous questions and have speculated on various 
possibilities throughout this report, and it would be an impossible 
task to summarize these questions and speculations here. Rather, we 
will mention only one or two questions from each area. These may not 
prove to be the most exciting or promising problems, but we feel that 
they are representative and do summarize the tone of this report. 

A. Stability Analysis -- What is the effect on overall
stability of the finite arithmetic constraints of a
digitally implemented feedback controller?

B. Parameter Identification, Linear Prediction, Least
Squares, and Kalman Filtering -- Can state space and
recursive filtering methods be applied to model and
identify time-varying models of speech? Do stochastic
realization and recursive maximum likelihood methods
offer useful tools for pole-zero modelling of speech?

C. Synthesis, Realization, and Implementation -- Can state
space realization and filter structure concepts be combined
to obtain useful realizations for multivariable or time-
varying digital filters? Can state space noise analysis
methods aid in roundoff analysis of digital filters? Can
we develop design techniques (e.g. for feedback controller
design) that directly take the constraints (storage, speed,
word length) of digital implementation into account?

D. Multiparameter Systems, Distributed Processes, and Random
Fields -- What role do state space methods (if they exist)
play in the analysis and synthesis of 2-D filters? Can
Lyapunov theory (if it exists) aid in understanding the
effects of finite arithmetic in 2-D systems? What role
should 2-D recursive estimation and detection techniques
have in image processing? Can 2-D concepts provide any
insight and/or results for distributed parameter, space-
time, or decentralized control problems?

E. Some Issues in Nonlinear System Analysis: Homomorphic
Filtering, Bilinear Systems, and Algebraic System Theory --
Is this algebraic point of view a useful approach to the
analysis and synthesis of nonlinear systems and filters?
Homomorphic filtering has found widespread application; can
the same be said for other algebraic concepts?

Whether any of these issues or any of the others raised in this 
report has useful answers is a question for the future. It is our 
feeling that many of them do, and it is our hope that others will think 
so as well. 
Appendix 1: A Lyapunov Function Argument for the Limit Cycle Problem
in a Second-Order Filter

Consider the second-order filter in Figure Ap.1. The ideal (undriven)
dynamics of this filter are



Let us look for a quadratic Lyapunov function

$$ V(x) = x'Bx, \qquad B = \begin{pmatrix} b_{11} & b_{12} \\ b_{12} & b_{22} \end{pmatrix} \tag{Ap.4} $$

In fact, let us assume that B proves the asymptotic stability of (Ap.1), i.e.
(see Section A)

$$ B > 0, \qquad B - A'BA > 0 \tag{Ap.5} $$

We compute

$$ \Delta V(z) = F'(Az)\,B\,F(Az) - z'Bz \tag{Ap.6a} $$

$$ \phantom{\Delta V(z)} = \left(F'(Az)\,B\,F(Az) - z'A'BAz\right) + \left(z'A'BAz - z'Bz\right) \tag{Ap.6b} $$
From this, it is clear that we will have asymptotic stability if

$$ F'(Az)\,B\,F(Az) - z'A'BAz \le 0 \qquad \forall z \tag{Ap.7} $$

or if the somewhat stronger condition

$$ F'(\zeta)\,B\,F(\zeta) - \zeta'B\zeta \le 0 \qquad \forall \zeta \tag{Ap.8} $$

holds.


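Footnotes: (1) This is the criterion used by Willson [A-2] for the overflow
problem. (2) Condition (Ap.8) is not stronger if A is invertible, which is true
if and only if $b \neq 0$.

Equation (Ap.8) is equivalent to

$$ b_{11}\left(q^2(\zeta_1) - \zeta_1^2\right) + 2\,b_{12}\,\zeta_2\left(q(\zeta_1) - \zeta_1\right) \le 0 \qquad \forall\, \zeta_1, \zeta_2 \tag{Ap.9} $$

Using the fact that $|q(\zeta_1)| \le |\zeta_1|$, we can see that (Ap.9) holds if and only
if $b_{12} = 0$. Thus we must find conditions on A such that there exists a diagonal
B satisfying (Ap.5), i.e.,

$$ b_{11} > 0, \qquad b_{22} > 0, \qquad B - A'BA > 0 \tag{Ap.10} $$

Equation (Ap.10) can be further reduced to the following inequalities (after we
normalize $b_{11} = 1$, which we can do simply by scaling B):

$$ 0 < b_{22} < 1 - a^2, \qquad (1 - a^2 - b_{22})(b_{22} - b^2) - a^2 b^2 > 0 $$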

We can rewrite the second inequality as

$$ (a^2 - b^2 - 1)\,b_{22} < -b_{22}^2 - b^2 \tag{Ap.11} $$

and the possibilities are given in Figure Ap.2. If $(a^2 - b^2 - 1) > 0$, either we
have no region in which (Ap.11) holds (b) or the region is for negative values
of $b_{22}$ (a), which violates the first inequality, $0 < b_{22} < 1 - a^2$. Thus, we must
have

$$ a^2 - b^2 - 1 < 0 \tag{Ap.12} $$

and in fact, we must have case (c), which means that there must be two real
solutions to (Ap.11) when the inequality is made into an equality. Some
algebraic manipulations yield the inequalities

$$ 0 < b_{22} < 1 - a^2 \tag{Ap.13a} $$

$$ \sigma(a,b) \triangleq a^2 - b^2 - 1 < 0 \tag{Ap.13b} $$

$$ \rho(a,b) \triangleq (1 - a^2 + b^2)^2 - 4b^2 > 0 \tag{Ap.13c} $$

$$ \frac{-\sigma(a,b) - \sqrt{\rho(a,b)}}{2} < b_{22} < \frac{-\sigma(a,b) + \sqrt{\rho(a,b)}}{2} \tag{Ap.13d} $$
Using (Ap.13b), (Ap.13c) is equivalent to

$$ 1 - a^2 + b^2 > 2|b| $$

or

$$ |a| < \bigl|\,1 - |b|\,\bigr| \tag{Ap.14} $$

and under this condition, both (Ap.13b) and (Ap.13c) hold. Then, we will have
that we can find a value of $b_{22}$ if and only if the inequalities (Ap.13a) and
(Ap.13d) overlap. Combining these, we find that the region of (a,b)-space
for which we can use this technique to prove stability is

$$ |a| < 1 - |b|, \qquad |b| < 1 \tag{Ap.15} $$

which is illustrated in Figure Ap.3. The triangle is the region in which the
linear system (Ap.1) is asymptotically stable and the cross-hatched area is
(Ap.15). In the remaining part of the triangle, one must use a non-diagonal
B, and this technique will not work. This is not to say that Lyapunov func-
tions can't be found that will prove stability in these regions of (a,b) space
in the one magnitude truncator case, but rather that one will have to work
harder to find them if they exist (either by working directly with (A.6a) or
by looking for nonquadratic Lyapunov functions). This derivation hopefully
illustrates the type of argument that one can make using Lyapunov functions
and also the difficulties and the limitations of the technique.
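
The region (Ap.15) can be cross-checked numerically. The following is a minimal sketch, not the report's own procedure. It assumes the state matrix A = [[-a, -b], [1, 0]] (equations (Ap.1)-(Ap.3) are not legible in this copy; this form is an assumption chosen because it reproduces the inequalities above) and B = diag(1, b22), for which B - A'BA has entries [[1 - a^2 - b22, -ab], [-ab, b22 - b^2]]. The function names are hypothetical.

```python
import math

def b22_interval(a, b):
    """Open interval of admissible b22 from (Ap.13a)-(Ap.13d), or None if empty."""
    sigma = a * a - b * b - 1.0                      # (Ap.13b): must be negative
    rho = (1.0 - a * a + b * b) ** 2 - 4.0 * b * b   # (Ap.13c): must be positive
    if sigma >= 0.0 or rho <= 0.0:
        return None
    root = math.sqrt(rho)
    lo = max(0.0, (-sigma - root) / 2.0)             # overlap of (Ap.13a) and (Ap.13d)
    hi = min(1.0 - a * a, (-sigma + root) / 2.0)
    return (lo, hi) if lo < hi else None

def in_region(a, b):
    """Membership test for the region (Ap.15)."""
    return abs(a) < 1.0 - abs(b) and abs(b) < 1.0

def certifies(a, b, b22):
    """Check B > 0 and B - A'BA > 0 for B = diag(1, b22) and the ASSUMED A."""
    m11 = (1.0 - a * a) - b22
    m12 = -a * b
    m22 = b22 - b * b
    # A symmetric 2x2 matrix is positive definite iff m11 > 0 and det > 0.
    return b22 > 0.0 and m11 > 0.0 and m11 * m22 - m12 * m12 > 0.0

print(b22_interval(0.3, 0.5))   # inside (Ap.15): a nonempty interval
print(b22_interval(0.8, 0.5))   # stable linear system, but outside (Ap.15)
```

The second call corresponds to a point that lies in the stability triangle of the linear system but outside the cross-hatched region, so no diagonal B exists there under these assumptions.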
Appendix 2: The Discrete Fourier Transform and Circulant Matrices

Circulant matrices appear in several places in Section D. In this 
appendix we indicate some of their properties. Suppose we have a block 
circulant matrix A 


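$$ A = \begin{pmatrix} A_0 & A_{N-1} & \cdots & A_1 \\ A_1 & A_0 & \cdots & A_2 \\ \vdots & \vdots & \ddots & \vdots \\ A_{N-1} & A_{N-2} & \cdots & A_0 \end{pmatrix} \tag{Ap.16} $$

where each $A_i$ is PxQ. Consider the equation

$$ y = Ax \tag{Ap.17} $$

where y is an NP-vector, partitioned into P-vectors

$$ y' = (y_0', \ldots, y_{N-1}') \tag{Ap.18} $$

and x is an NQ-vector, partitioned into Q-vectors

$$ x' = (x_0', \ldots, x_{N-1}') \tag{Ap.19} $$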
Combining (Ap.16)-(Ap.19) we obtain

$$ y_i = \sum_{j=0}^{N-1} A_{i-j}\, x_j \tag{Ap.20} $$

where all subscripts are to be interpreted modulo N. Hence, the right-hand
side of (Ap.17) is nothing more than a cyclic convolution. Let us take the
DFT of the sequences $\{A_i\}$, $\{x_i\}$, and $\{y_i\}$, where, for example

$$ y(\ell) = \sum_{i=0}^{N-1} y_i\, W_N^{i\ell}, \qquad \ell = 0, \ldots, N-1 \tag{Ap.21} $$

where (with $j = \sqrt{-1}$ in the exponent)

$$ W_N = e^{-j 2\pi / N} \tag{Ap.22} $$

In the transformed domain, we now have N decoupled sets of equations

$$ y(\ell) = A(\ell)\, x(\ell), \qquad \ell = 0, \ldots, N-1 \tag{Ap.23} $$

and we have effectively block diagonalized the block circulant matrix A.

If we also have that P=Q and that each of the $A_i$ is circulant, then each
of the $A(\ell)$ is circulant, and we can diagonalize each of them by iterating
the above development. Thus we can use the FFT to diagonalize A. In
addition, if we write

$$ \tilde{y}' = (y(0)', \ldots, y(N-1)'), \qquad \tilde{y} = Ty \tag{Ap.24} $$

$$ \tilde{x}' = (x(0)', \ldots, x(N-1)'), \qquad \tilde{x} = Sx \tag{Ap.25} $$

(here S=T if P=Q), we observe that

$$ T A S^{-1} = \mathrm{diag}\,(A(0), \ldots, A(N-1)) \tag{Ap.26} $$

Therefore, in this case, the calculation of Ms, where M is the matrix of
eigenvectors of A, can be performed using the FFT.
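
The decoupling in (Ap.23) can be checked on a small example. The sketch below treats only the scalar case P = Q = 1 and uses a direct O(N^2) DFT rather than an FFT, purely for illustration. The kernel sign convention W_N = exp(-j 2 pi / N) and the function names are assumptions; the decoupling itself holds for either sign.

```python
import cmath

def dft(seq):
    """Unnormalized DFT: out[l] = sum_i seq[i] * W_N**(i*l), W_N = exp(-2j*pi/N)."""
    n = len(seq)
    w = cmath.exp(-2j * cmath.pi / n)
    return [sum(seq[i] * w ** (i * l) for i in range(n)) for l in range(n)]

def circulant_matvec(a, x):
    """y_i = sum_j a[(i - j) mod N] * x[j], i.e. y = A x for the circulant A of a."""
    n = len(a)
    return [sum(a[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]

a = [4.0, 1.0, 0.0, 2.0]      # first column of the circulant matrix
x = [1.0, -2.0, 3.0, 0.5]
y = circulant_matvec(a, x)    # cyclic convolution in the original domain

lhs = dft(y)                                          # y(l)
rhs = [al * xl for al, xl in zip(dft(a), dft(x))]     # A(l) * x(l)
max_err = max(abs(p - q) for p, q in zip(lhs, rhs))
print(max_err)                # rounding-level discrepancy
```

In the block case each product A(l) x(l) becomes a PxQ matrix times a Q-vector, but the structure of the check is the same.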